Changing the state of data checksums in a running cluster

Started by Daniel Gustafsson over 1 year ago · 74 messages
#1Daniel Gustafsson
daniel@yesql.se
1 attachment(s)

After some off-list discussion about the desirability of this feature, where
several hackers opined that it's something that we should have, I've decided to
rebase this patch and submit it one more time. There are several (long)
threads covering the history of this patch [0][1], related work stemming from
this [2] as well as earlier attempts and discussions [3][4]. Below I try to
respond to a summary of points raised in those threads.

The mechanics of the patch haven't changed since the last posted version; it
has mainly been polished slightly.  A high-level overview of the processing:
it uses a launcher/worker model where the launcher spawns a worker per
database, which traverses all pages and dirties them in order to calculate and
set the checksum on them.  During this inprogress state all backends calculate
and write checksums but don't verify them on read.  Once all pages have been
checksummed, the state of the cluster switches over to "on", synchronized
across all backends with a procsignalbarrier.  At this point checksums are
verified and processing is equivalent to checksums having been enabled at
initdb time.  When a user disables checksums the cluster enters a state where
all backends still write checksums until all backends have acknowledged that
they have stopped verifying checksums (again using a procsignalbarrier).  At
this point the cluster switches to "off" and checksums are neither written nor
verified.  In case the cluster is restarted, voluntarily or via a crash,
processing will have to be restarted (more on that further down).
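
To make the write/verify gating above concrete, below is a condensed,
standalone sketch of the checks a backend performs before writing or verifying
a page checksum.  The names and the state set mirror DataChecksumsNeedWrite()
and DataChecksumsNeedVerify() in the attached patch, but the code is
illustrative only, not the patch code:

/* Illustrative sketch only -- simplified from the attached patch. */
#include <stdbool.h>
#include <stdio.h>

/*
 * Possible data_checksums states.  The values here are made up for the
 * example; the patch uses the PG_DATA_CHECKSUM_*_VERSION constants from
 * storage/bufpage.h.
 */
typedef enum
{
	CHECKSUMS_OFF = 0,
	CHECKSUMS_ON,
	CHECKSUMS_INPROGRESS_ON,
	CHECKSUMS_INPROGRESS_OFF
} ChecksumState;

/* Backend-local copy of the state, updated when a procsignalbarrier is absorbed */
static ChecksumState local_state = CHECKSUMS_OFF;

/* Checksums are written in "on" and in both transition states */
static bool
checksums_need_write(void)
{
	return local_state != CHECKSUMS_OFF;
}

/* Checksums are only verified once the state is fully "on" */
static bool
checksums_need_verify(void)
{
	return local_state == CHECKSUMS_ON;
}

int
main(void)
{
	local_state = CHECKSUMS_INPROGRESS_ON;
	printf("write=%d verify=%d\n",
		   checksums_need_write(), checksums_need_verify());
	return 0;
}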

The user-facing controls for this are two SQL-level functions, one for
enabling and one for disabling.  The existing data_checksums GUC remains but
is expanded with more possible states (with on/off retained).
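
For illustration, both functions essentially boil down to a single
launcher-start request carrying an enable flag plus the throttling parameters
(the patch defaults to a cost delay of 0 and a cost limit of 100 when none are
given).  The standalone mock below sketches that call flow only; it is not the
actual patch code:

/* Standalone mock of the SQL-level entry points -- illustrative only. */
#include <stdbool.h>
#include <stdio.h>

static void
start_checksums_launcher(bool enable, int cost_delay, int cost_limit)
{
	/* In the patch this registers a dynamic background worker instead. */
	printf("launcher: enable=%d delay=%d limit=%d\n",
		   enable, cost_delay, cost_limit);
}

/* pg_enable_data_checksums([cost_delay, cost_limit]) */
static void
enable_data_checksums(int cost_delay, int cost_limit)
{
	if (cost_delay < 0 || cost_limit <= 0)
	{
		fprintf(stderr, "invalid throttling parameters\n");
		return;
	}
	start_checksums_launcher(true, cost_delay, cost_limit);
}

/* pg_disable_data_checksums() */
static void
disable_data_checksums(void)
{
	start_checksums_launcher(false, 0, 0);
}

int
main(void)
{
	enable_data_checksums(0, 100);	/* defaults used by the patch */
	disable_data_checksums();
	return 0;
}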

Complaints against earlier versions
===================================
Seasoned hackers might remember that this patch has been on -hackers before.
There has been a lot of review, and AFAICT all specific comments have been
addressed. There are however a few larger more generic complaints:

* Restartability - the initial version of the patch did not support stateful
restarts; a shutdown (or crash) before checksums were enabled would result in
a need to start over from the beginning.  This was deemed the safe
orchestration method.  The lack of this feature was seen as a serious
drawback, so it was added.  Subsequent review instead found the patch to be
too complicated with too large a featureset.  I think there is merit to both
of these arguments: being able to restart is a great feature; and being able
to reason about the correctness of a smaller patch is also great.  As of this
submission I have removed the ability to restart in order to keep the scope of
the patch small (which is also where the previous version ended up; it
received no review after the removal).  The way I prefer to frame this is to
first add scaffolding and infrastructure (this patch) and leave refinements
and add-on features (restartability, but also others like parallel workers,
optimizing rare cases, etc.) for follow-up patches.

* Complexity - it was brought up that this is a very complex patch for a niche
feature, and there is a lot of truth to that.  It is inherently complex to
change a pg_control level state in a running cluster.  There might be ways to
make the current patch less complex, while not sacrificing stability, and if
so that would be great.  A lot of the complexity came from being able to
restart processing, and that's now removed in this version, but it's clearly
not close to a one-line diff even without it.

Other complaints were addressed, in part by the invention of
procsignalbarriers, which make this synchronization possible.  In re-reading
the threads I might
have missed something which is still left open, and if so I do apologize for
that.

Open TODO items:
================
* Immediate checkpoints - the code currently uses CHECKPOINT_IMMEDIATE in
order to be able to run the tests on it in a timely manner.  This is overly
aggressive, and dialling it back while still keeping the tests fast is a TODO.
I'm not sure what the best option is there.

* Monitoring - an insightful off-list reviewer asked how the current progress
of the operation is monitored. So far I've been using pg_stat_activity but I
don't disagree that it's not a very sharp tool for this. Maybe we need a
specific function or view or something? There clearly needs to be a way for a
user to query state and progress of a transition.

* Throttling - right now the patch uses the vacuum access strategy, with the
same cost options as vacuum, in order to implement throttling.  This is in
part due to the patch having started out modelled on the autovacuum worker,
but it may not be the right match for throttling checksum processing.  A
standalone sketch of this cost model is included after this list.

* Naming - the in-between states when data checksums are enabled or disabled
are called inprogress-on and inprogress-off. The reason for this is simply
that early on there were only three states: inprogress, on and off, and the
process of disabling wasn't labeled with a state. When this transition state
was added it seemed like a good idea to tack the end-goal onto the transition.
These state names make the code easily greppable but might not be the most
obvious choices for anything user-facing.  Are "enabling" and "disabling"
better terms to use (across the board, or just user-facing), or should we
stick to the current names?
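
As referenced in the throttling item above, the vacuum-style cost model boils
down to accumulating a cost per processed page and sleeping once a configured
limit is exceeded.  The standalone sketch below only illustrates that shape;
the patch reuses the regular vacuum cost machinery rather than code like this:

/* Standalone sketch of vacuum-style cost-based throttling -- illustrative only. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

static int	cost_balance = 0;

/*
 * Charge a cost per processed page; once the limit is exceeded, sleep for
 * cost_delay_ms milliseconds and start accumulating again.
 */
static void
throttle(int page_cost, int cost_limit, int cost_delay_ms)
{
	cost_balance += page_cost;
	if (cost_limit > 0 && cost_balance >= cost_limit)
	{
		struct timespec ts;

		ts.tv_sec = cost_delay_ms / 1000;
		ts.tv_nsec = (long) (cost_delay_ms % 1000) * 1000000L;
		nanosleep(&ts, NULL);
		cost_balance = 0;
	}
}

int
main(void)
{
	for (int page = 0; page < 1000; page++)
	{
		/* ... checksum and dirty the page here ... */
		throttle(10, 200, 20);	/* cost 10 per page, limit 200, 20 ms delay */
	}
	printf("done\n");
	return 0;
}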

There are ways in which this processing can be optimized to achieve better
performance, but in order to keep the goalposts in sight and the patch size
down, they are left as future work.

--
Daniel Gustafsson

[0]: /messages/by-id/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
[1]: /messages/by-id/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
[2]: /messages/by-id/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
[3]: /messages/by-id/FF393672-5608-46D6-9224-6620EC532693@endpoint.com
[4]: /messages/by-id/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp=-7OJWBbcg@mail.gmail.com

Attachments:

v1-0001-Support-checksum-enable-disable-in-a-running-clus.patch (application/octet-stream)
From fb4e4705bc2b53a1531f0e0ffc235cc83d379b81 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Tue, 2 Jul 2024 15:20:43 +0200
Subject: [PATCH v1] Support checksum enable/disable in a running cluster

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled at initdb time, or
when the cluster is offline using pg_checksums. This commit introduces
functionality to enable, and disable, data checksums without the need
to turn off the cluster.

A dynamic background worker is responsible for launching a per-database
worker which will mark all buffers dirty for all relations with storage
in order for them to have data checksums on write. Once all relations
in all databases have been processed, the data_checksums state can be
set to "on" and the cluster will at that point be identical to one
which had checksums enabled from the start.

While the cluster is writing checksums on existing buffers, checksums
are written but not verified during reading to avoid false negatives.
Disabling checksums will not touch any buffers (but existing checksums
cannot be re-used in case checksums are immediately re-enabled). While
disabling, checksums are again written but not verified, to ensure that
concurrent backends which haven't yet started disabling checksums won't
incur verification errors.

New in-progress states are introduced for data_checksums which during
processing ensure that backends know whether to verify and write
checksums. All state changes across backends are synchronized using
procsignalbarriers.

Earlier versions of this patch were reviewed by Heikki Linnakangas,
Robert Haas, Andres Freund, Tomas Vondra, Michael Banck and Andrey
Borodin.

Authors: Daniel Gustafsson, Magnus Hagander
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/monitoring.sgml                  |    8 +-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/heap/heapam.c              |    1 +
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  455 +++++-
 src/backend/access/transam/xlogfuncs.c        |   55 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1353 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/buffer/bufmgr.c           |    4 +-
 src/backend/storage/ipc/ipci.c                |    3 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/storage/smgr/bulk_write.c         |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    6 -
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   21 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   29 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/procsignal.h              |    5 +
 src/test/Makefile                             |   10 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   78 +
 src/test/checksum/t/002_restarts.pl           |   83 +
 src/test/checksum/t/003_standby_restarts.pl   |  123 ++
 src/test/checksum/t/004_offline.pl            |   93 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/tools/pgindent/typedefs.list              |    5 +
 50 files changed, 2691 insertions(+), 48 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f1f22a1960..5b6b8a80d9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29381,6 +29381,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 991f629907..0c0d21b00c 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3393,8 +3393,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database. Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3404,8 +3404,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any progress made online.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 05e2a8f8be..bf6d28b40b 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 91b20147a0..dff69ee0de 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8340,6 +8340,7 @@ log_heap_visible(Relation rel, Buffer heap_buffer, Buffer vm_buffer,
 	XLogRegisterBuffer(0, vm_buffer, 0);
 
 	flags = REGBUF_STANDARD;
+
 	if (!XLogHintBitIsNeeded())
 		flags |= REGBUF_NO_IMAGE;
 	XLogRegisterBuffer(1, heap_buffer, flags);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index e455400716..b220f1fd4b 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -152,6 +153,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 	{
 		/* No details to write out */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -203,6 +224,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 33e27a6e72..8770abee18 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -642,6 +642,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -710,6 +720,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -823,9 +835,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -838,7 +851,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4513,9 +4528,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4549,13 +4562,349 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages, it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
 {
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		UpdateControlFile();
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControlData
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6105,6 +6454,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here, it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8086,6 +8462,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8514,6 +8908,47 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		UpdateControlFile();
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 4e46baaebd..9d3ef15d12 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/standby.h"
@@ -747,3 +748,57 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable. Starts a background
+ * worker launcher which in turn launches background workers which compute
+ * data checksums for all pages.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(true, 0, 100);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums_p(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 9a2bf59e84..a464551ff4 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1611,7 +1611,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2004,6 +2005,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 77707bb384..ece5644c17 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..e45eb49931
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1353 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a database at initdb time or when shut down
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to process data checksum validation.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: if the checksum
+ *     on the page happens to already match we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry to open a database before giving up and considering
+ * it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool *already_connected);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * Starts data checksum processing, both for enabling and for disabling
+ * checksums.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums, int cost_delay, int cost_limit)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
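+		/* No main argument; the launch parameters are in shared memory */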
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/*
+		 * Report to pgstat every 100 blocks to keep from overwhelming the
+		 * activity reporting with close to identical reports.
+		 */
+		if ((blknum % 100) == 0)
+		{
+			snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s block %d/%d)",
+					 relns, RelationGetRelationName(reln),
+					 forkNames[forkNum], blknum, numblocks);
+			pgstat_report_activity(STATE_RUNNING, activity);
+		}
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can only happen if there was a valid checksum on the
+		 * page at some point in the past, i.e. when checksums were first on,
+		 * then off, and are now being turned on again. If wal_level is set to
+		 * "minimal", the WAL record could be skipped when the checksum is
+		 * calculated to already be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without a
+		 * lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
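+		/* Apply the vacuum-style cost-based throttling, if configured */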
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	elog(DEBUG2,
+		 "adding data checksums to relation with OID %u",
+		 relationId);
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
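+
+	/* Ensure rel->rd_smgr is set for the smgrexists() calls below */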
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
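+	/*
+	 * Default to failure; the worker overwrites this before exiting, so a
+	 * worker that dies without reporting is counted as failed.
+	 */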
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes back up checksums will be off as
+	 * expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that a
+ * new launcher can be started once this one has exited or been aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all currently running transactions have finished
+ *
+ * Returns when all transactions which were active at the time this function
+ * was called have ended. If the postmaster dies while waiting, the process
+ * exits with a FATAL error since processing cannot be completed.
+ *
+ * NB: this will return early if aborted by SIGINT, or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	bool		connected = false;
+	bool		status = false;
+
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	/*
+	 * If we're asked to enable checksums, we need to check if processing was
+	 * previously interrupted such that we should resume rather than start
+	 * from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
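+		/*
+		 * Move the cluster to the "inprogress-on" state so that all backends
+		 * start writing checksums before we begin rewriting existing pages.
+		 */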
+		SetDataChecksumsOnInProgress();
+
+		status = ProcessAllDatabases(&connected);
+		if (!status)
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("could not enable data checksums in cluster")));
+		}
+
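+		/*
+		 * All pages now carry checksums, so flip the cluster state to "on";
+		 * from here on backends also verify checksums on read.
+		 */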
+		SetDataChecksumsOn();
+	}
+	else
+	{
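+		/*
+		 * Disabling requires no per-page processing, only a change of the
+		 * cluster-wide state.
+		 */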
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This repeatedly generates a list of databases to process when enabling
+ * checksums, compares each new list against the databases already processed,
+ * and loops until no new databases are found.
+ */
+static bool
+ProcessAllDatabases(bool *already_connected)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	ListCell   *lc;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	if (!*already_connected)
+		BackgroundWorkerInitializeConnection(NULL, NULL, 0);
+
+	*already_connected = true;
+
+	/*
+	 * Set up so that the first worker processes the shared catalogs, rather
+	 * than doing so once in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach(lc, DatabaseList)
+		{
+			DataChecksumsWorkerDatabase *db = (DataChecksumsWorkerDatabase *) lfirst(lc);
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			elog(DEBUG1,
+				 "starting processing of database %s with oid %u",
+				 db->dbname, db->dboid);
+
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
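+			/*
+			 * Record, or update, the outcome for this database; the retry
+			 * counter tracks how many times it has been requeued.
+			 */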
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting loop" : "processing complete"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach(lc, DatabaseList)
+	{
+		DataChecksumsWorkerDatabase *db = (DataChecksumsWorkerDatabase *) lfirst(lc);
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failures for databases which still
+		 * exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums could not be enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. TODO: we probably
+	 * don't want to use a CHECKPOINT_IMMEDIATE here but it's very convenient
+	 * for testing until the patch is fully baked, as it may otherwise make
+	 * tests take a lot longer.
+	 */
+	RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT | CHECKPOINT_IMMEDIATE);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, we
+	 * want to be explicit about our intent for readability, since this is the
+	 * state which would need to be queried to support restartability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	ListCell   *lc;
+
+	if (!dblist)
+		return;
+
+	foreach(lc, dblist)
+	{
+		DataChecksumsWorkerDatabase *db = lfirst(lc);
+
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage (and thus can carry
+ * data checksums) are returned. If include_shared is true then shared
+ * relations are included as well in a non-temporary list. include_shared has
+ * no relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	ListCell   *lc;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	ereport(DEBUG1,
+			(errmsg("starting data checksum processing in database with OID %u",
+					dboid)));
+
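+	/*
+	 * Connect to the target database, bypassing datallowconn so that
+	 * databases which reject connections still have checksums enabled.
+	 */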
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumPageHit = 0;
+	VacuumPageMiss = 0;
+	VacuumPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
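+	/*
+	 * Process all relations with storage in the database, including the
+	 * shared catalogs if this worker has been told to handle them.
+	 */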
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach(lc, RelationList)
+	{
+		Oid			reloid = lfirst_oid(lc);
+
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		ListCell   *lc;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach(lc, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, lfirst_oid(lc)))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/* At least one temp table is left to wait for */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
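+		/*
+		 * If the target state changed to disabling while we were waiting,
+		 * treat it as a request to abort.
+		 */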
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+	ereport(DEBUG1,
+			(errmsg("data checksum processing completed in database with OID %u",
+					dboid)));
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index d687ceee33..6c38a57a20 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6181673095..9baba83e8c 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1521,7 +1521,9 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 				bufBlock = BufHdrGetBlock(bufHdr);
 			}
 
-			/* check for garbage data */
+			/*
+			 * Check for garbage data.
+			 */
 			if (!PageIsVerifiedExtended((Page) bufBlock, io_first_block + j,
 										PIV_LOG_WARNING | PIV_REPORT_STAT))
 			{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2100150f01..14710ca5c9 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
@@ -152,6 +153,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 #ifdef EXEC_BACKEND
 	size = add_size(size, ShmemBackendArraySize());
 #endif
@@ -347,6 +349,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 4ed9cedcdd..8c9fe9e74f 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -541,6 +542,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/storage/smgr/bulk_write.c b/src/backend/storage/smgr/bulk_write.c
index 4a10ece4c3..45ed560970 100644
--- a/src/backend/storage/smgr/bulk_write.c
+++ b/src/backend/storage/smgr/bulk_write.c
@@ -36,6 +36,7 @@
 
 #include "access/xloginsert.h"
 #include "access/xlogrecord.h"
+#include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/bufpage.h"
 #include "storage/bulk_write.h"
@@ -253,6 +254,7 @@ smgr_bulk_flush(BulkWriteState *bulkstate)
 		}
 		else
 			smgrwrite(bulkstate->smgr, bulkstate->forknum, blkno, page, true);
+
 		pfree(page);
 	}
 
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 9d6e067382..6b1a6f1348 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -332,6 +332,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index db37beeaae..ce76799979 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -114,6 +114,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksum enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -345,6 +347,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 3876339ee1..973fd7d0b2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -1115,9 +1115,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1130,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 537d92c0cf..6571c187cc 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0805398e24..f7a74d47ba 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -759,6 +759,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit();
 
 	/*
@@ -903,7 +908,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d28b0bcb40..4e12545ed7 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -484,6 +484,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,7 @@ static int	segment_size;
 static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1870,17 +1878,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5120,6 +5117,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index b5bb0e7887..3dcdea2b38 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 1f0ccea3ed..fb9eaaf067 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -692,6 +692,15 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are being turned on in the old cluster, but the
+	 * datachecksumsworker has yet to finish, then disallow the upgrade. The
+	 * user should either let the process finish, or turn off checksums,
+	 * before retrying.
+	 */
+	if (oldctrl->data_checksum_version == 2)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 1a1f11a943..43aa17f166 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -115,7 +115,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -227,7 +227,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 5161b72f28..e085f60043 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when the data checksum state is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+}			xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index a00606ffcd..a2fb60c864 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -79,6 +79,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d4ac578ae6..f9f075e9bb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12007,6 +12007,27 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4', proallargtypes => '{int4,int4}',
+  proargmodes => '{i,i}',
+  proargnames => '{cost_delay,cost_limit}',
+  prosrc => 'enable_data_checksums_p' },
+
+{ oid => '9259', descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proargtypes => '', proparallel => 'r', prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 90f9b21b25..2ae420c16a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -360,6 +360,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -383,6 +386,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..1414abd7f5
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,29 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+extern void StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay, int cost_limit);
+
+/* Background worker entrypoints */
+extern void DataChecksumsWorkerLauncherMain(Datum arg);
+extern void DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index d0df02d39c..a88f1e7916 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -201,7 +201,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 6a2f64c54f..64e2ea5a50 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 7d290ea7d0..19557b3582 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/test/Makefile b/src/test/Makefile
index dbd3192874..36023c1878 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,15 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: "make check" creates a temporary installation with multiple
+nodes (a primary and one or more standbys) for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..5db3e75370
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,78 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums();");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums();");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed such
+# that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..5ed6018f96
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,83 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksum enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..6d785a7807
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,123 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres', "SELECT pg_enable_data_checksums();");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary after disabling');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..08e4eff96c
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,93 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksum enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index c3d0dfedf1..b07d5d2d00 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -7,6 +7,7 @@ subdir('authentication')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 0135c5a795..060d753fed 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3421,6 +3421,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	my $pgdata = $self->data_dir;
+
+	print "### Enabling checksums in \"$pgdata\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $pgdata, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	my $pgdata = $self->data_dir;
+
+	print "### Disabling checksums in \"$pgdata\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $pgdata, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e6c1caf649..32d16c286a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -393,6 +393,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -578,6 +579,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
-- 
2.39.3 (Apple Git-146)

#2Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Daniel Gustafsson (#1)
Re: Changing the state of data checksums in a running cluster

Hi Daniel,

Thanks for rebasing the patch and submitting it again!

On 7/3/24 08:41, Daniel Gustafsson wrote:

After some off-list discussion about the desirability of this feature, where
several hackers opined that it's something that we should have, I've decided to
rebase this patch and submit it one more time. There are several (long)
threads covering the history of this patch [0][1], related work stemming from
this [2] as well as earlier attempts and discussions [3][4]. Below I try to
respond to a summary of points raised in those threads.

The mechanics of the patch hasn't changed since the last posted version, it has
mainly been polished slightly. A high-level overview of the processing is:
It's using a launcher/worker model where the launcher will spawn a worker per
database which will traverse all pages and dirty them in order to calculate and
set the checksum on them. During this inprogress state all backends calculated
and write checksums but don't verify them on read. Once all pages have been
checksummed the state of the cluster will switch over to "on" synchronized
across all backends with a procsignalbarrier. At this point checksums are
verified and processing is equal to checksums having been enabled initdb. When
a user disables checksums the cluster enters a state where all backends still
write checksums until all backends have acknowledged that they have stopped
verifying checksums (again using a procsignalbarrier). At this point the
cluster switches to "off" and checksums are neither written nor verified. In
case the cluster is restarted, voluntarily or via a crash, processing will have
to be restarted (more on that further down).

The user facing controls for this are two SQL level functions, for enabling and
disabling. The existing data_checksums GUC remains but is expanded with more
possible states (with on/off retained).

Complaints against earlier versions
===================================
Seasoned hackers might remember that this patch has been on -hackers before.
There has been a lot of review, and AFAICT all specific comments have been
addressed. There are however a few larger more generic complaints:

* Restartability - the initial version of the patch did not support stateful
restarts, a shutdown performed (or crash) before checksums were enabled would
result in a need to start over from the beginning. This was deemed the safe
orchestration method. The lack of this feature was seen as serious drawback,
so it was added. Subsequent review instead found the patch to be too
complicated with a too large featureset. I thihk there is merit to both of
these arguments: being able to restart is a great feature; and being able to
reason about the correctness of a smaller patch is also great. As of this
submission I have removed the ability to restart to keep the scope of the patch
small (which is where the previous version was, which received no review after
the removal). The way I prefer to frame this is to first add scaffolding and
infrastructure (this patch) and leave refinements and add-on features
(restartability, but also others like parallel workers, optimizing rare cases,
etc) for follow-up patches.

I 100% support this approach.

Sure, I'd like to have a restartable tool, but clearly that didn't go
particularly well, and we still have nothing to enable checksums online.
That doesn't seem to benefit anyone - to me it seems reasonable to get
the non-restartable tool in, and then maybe later someone can improve
this to make it restartable. Thanks to the earlier work we know it's
doable, even if it was too complex.

This way it's at least possible to enable checksums online with some
additional care (e.g. to make sure no one restarts the cluster etc.).
I'd bet that for the vast majority of systems this will work just fine. Huge
systems with some occasional / forced restarts may not be able to make
this work - but then again, that's no worse than now.

* Complexity - it was brought up that this is a very complex patch for a niche
feature, and there is a lot of truth to that. It is inherently complex to
change a pg_control level state of a running cluster. There might be ways to
make the current patch less complex, while not sacrificing stability, and if so
that would be great. A lot of the complexity came from being able to
restart processing, and that's now removed for this version, but it's clearly
not close to a one-line-diff even without it.

I'd push back on this a little bit - the patch looks like this:

50 files changed, 2691 insertions(+), 48 deletions(-)

and if we ignore the docs / perl tests, then the two parts that stand
out are

src/backend/access/transam/xlog.c | 455 +++++-
src/backend/postmaster/datachecksumsworker.c | 1353 +++++++++++++++++

I don't think the worker code is exceptionally complex. Yes, it's not
trivial, but a lot of the 1353 inserts is comments (which is good) or
generic infrastructure to start the worker etc.

Other complaints were addressed, in part by the invention of procsignalbarriers
which makes this synchronization possible. In re-reading the threads I might
have missed something which is still left open, and if so I do apologize for
that.

Open TODO items:
================
* Immediate checkpoints - the code is currently using CHECKPOINT_IMMEDIATE in
order to be able to run the tests in a timely manner on it. This is overly
aggressive and dialling it back while still being able to run fast tests is a
TODO. Not sure what the best option is there.

Why not add a parameter to pg_enable_data_checksums(), specifying
whether to do an immediate checkpoint or wait for the next one? AFAIK
that's what we do in pg_backup_start, for example.
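For comparison, a rough sketch of what that could look like at the SQL level
(the checksum call below is purely hypothetical, just mirroring the optional
boolean that pg_backup_start already takes):

    -- pg_backup_start lets the caller request an immediate checkpoint
    -- instead of a spread one via its second argument.
    SELECT pg_backup_start('label', true);

    -- An analogous optional flag on pg_enable_data_checksums() could let the
    -- caller trade an I/O spike for a faster start (hypothetical signature,
    -- not what the posted patch implements).
    SELECT pg_enable_data_checksums(true);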

* Monitoring - an insightful off-list reviewer asked how the current progress
of the operation is monitored. So far I've been using pg_stat_activity but I
don't disagree that it's not a very sharp tool for this. Maybe we need a
specific function or view or something? There clearly needs to be a way for a
user to query state and progress of a transition.

Yeah, I think a view like pg_stat_progress_checksums would work.
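Until then, the state and the worker are at least visible with the same kind
of queries the TAP tests in the patch already use:

    -- Cluster-wide data checksum state: off, inprogress-on, on or
    -- inprogress-off.
    SELECT setting
      FROM pg_catalog.pg_settings
     WHERE name = 'data_checksums';

    -- Is the launcher/worker still running?
    SELECT pid, backend_type, wait_event_type, wait_event
      FROM pg_stat_activity
     WHERE backend_type LIKE 'datachecksumsworker%';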

* Throttling - right now the patch uses the vacuum access strategy, with the
same cost options as vacuum, in order to implement throttling. This is in part
due to the patch starting out modelled around autovacuum as a worker, but it
may not be the right match for throttling checksums.

IMHO it's reasonable to reuse the vacuum throttling. Even if it's not
perfect, it does not seem great to invent something new and end up with
two different ways to throttle stuff.
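To make that concrete, the throttling is driven by the optional arguments of
pg_enable_data_checksums(); the values below are only examples and the
parameter names are the ones used in the patch's documentation:

    -- Throttle the worker with the familiar cost-based vacuum knobs:
    -- sleep 10ms (cost_delay) each time 200 cost units (cost_limit)
    -- worth of pages have been processed.
    SELECT pg_enable_data_checksums(10, 200);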

* Naming - the in-between states when data checksums are enabled or disabled
are called inprogress-on and inprogress-off. The reason for this is simply
that early on there were only three states: inprogress, on and off, and the
process of disabling wasn't labeled with a state. When this transition state
was added it seemed like a good idea to tack the end-goal onto the transition.
These state names make the code easily greppable but might not be the most
obvious choices for anything user facing. Is "Enabling" and "Disabling" better
terms to use (across the board or just user facing) or should we stick to the
current?

I think the naming is fine. In the worst case we can rename that later,
seems more like a detail.

There are ways in which this processing can be optimized to achieve better
performance, but in order to keep goalposts in sight and patchsize down they
are left as future work.

+1

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#2)
Re: Changing the state of data checksums in a running cluster

On Wed, Jul 3, 2024 at 01:20:10PM +0200, Tomas Vondra wrote:

* Restartability - the initial version of the patch did not support stateful
restarts, a shutdown performed (or crash) before checksums were enabled would
result in a need to start over from the beginning. This was deemed the safe
orchestration method. The lack of this feature was seen as serious drawback,
so it was added. Subsequent review instead found the patch to be too
complicated with a too large featureset. I thihk there is merit to both of
these arguments: being able to restart is a great feature; and being able to
reason about the correctness of a smaller patch is also great. As of this
submission I have removed the ability to restart to keep the scope of the patch
small (which is where the previous version was, which received no review after
the removal). The way I prefer to frame this is to first add scaffolding and
infrastructure (this patch) and leave refinements and add-on features
(restartability, but also others like parallel workers, optimizing rare cases,
etc) for follow-up patches.

I 100% support this approach.

Yes, I was very disappointed when restartability sunk the patch, and I
saw this as another case where saying "yes" to every feature improvement
can lead to failure.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

#4Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#2)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 3 Jul 2024, at 13:20, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

Thanks for rebasing the patch and submitting it again!

Thanks for review, sorry for being so slow to pick this up again.

The attached version is a rebase with some level of cleanup and polish all
around, and most importantly it adresses the two points raised below.

* Immediate checkpoints - the code is currently using CHECKPOINT_IMMEDIATE in
order to be able to run the tests in a timely manner on it. This is overly
aggressive and dialling it back while still being able to run fast tests is a
TODO. Not sure what the best option is there.

Why not to add a parameter to pg_enable_data_checksums(), specifying
whether to do immediate checkpoint or wait for the next one? AFAIK
that's what we do in pg_backup_start, for example.

That's a good idea; pg_enable_data_checksums now accepts a third parameter
"fast" (defaulting to false) which enables immediate checkpoints when true.

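For the archives, usage then looks something like this (a sketch based on the
description above; the first two optional arguments remain the throttling
ones):

    -- Default behaviour, the required checkpoint is spread out as usual.
    SELECT pg_enable_data_checksums();

    -- "fast" requests an immediate checkpoint, mainly useful for testing.
    SELECT pg_enable_data_checksums(fast => true);
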
* Monitoring - an insightful off-list reviewer asked how the current progress
of the operation is monitored. So far I've been using pg_stat_activity but I
don't disagree that it's not a very sharp tool for this. Maybe we need a
specific function or view or something? There clearly needs to be a way for a
user to query state and progress of a transition.

Yeah, I think a view like pg_stat_progress_checksums would work.

Added in the attached version. It probably needs some polish (the docs for
sure do) but it's at least a start.
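A quick way to eyeball the progress from psql, using the column names as they
appear in the attached v2 patch:

    -- Rough progress of an in-flight data checksum operation.
    SELECT pid, phase,
           databases_processed, databases_total,
           relations_processed, relations_total,
           relation_current_blocks_processed, relation_current_blocks
      FROM pg_stat_progress_datachecksums;

Running that under \watch gives a crude progress display until the phase
reaches "done".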

--
Daniel Gustafsson

Attachments:

v2-0001-Support-checksum-enable-disable-in-a-running-clus.patchapplication/octet-stream; name=v2-0001-Support-checksum-enable-disable-in-a-running-clus.patch; x-unix-mode=0644Download
From 5a52eb6768dc34db0eceea1e21523b4acbcc01bd Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Tue, 2 Jul 2024 15:20:43 +0200
Subject: [PATCH v2] Support checksum enable/disable in a running cluster

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled at initdb time, or
when the cluster is offline using pg_checksums. This commit introduces
functionality to enable, and disable, data checksums without the need
to turn off the cluster.

A dynamic background worker is responsible for launching a per-database
worker which will mark all buffers dirty for all relations with storage
in order for them to have data checksums on write. Once all relations
in all databases have been processed, the data_checksums state can be
set to "on" and the cluster will at that point be identical to one
which had checksums enabled from the start.

While the cluster is writing checksums on existing buffers, checksums
are written but not verified during reading to avoid false negatives.
Disabling checksums will not touch any buffers (but existing checksums
cannot be re-used in case checksums are immediately re-enabled). While
disabling, checksums are again written but not verified, to ensure that
concurrent backends which haven't yet started disabling checksums won't
incur verification errors.

New in-progress states are introduced for data_checksums which during
processing ensure that backends know whether to verify and write
checksums. All state changes across backends are synchronized using
procsignalbarriers.

Earlier versions of this patch were reviewed by Heikki Linnakangas,
Robert Haas, Andres Freund, Tomas Vondra, Michael Banck and Andrey
Borodin.

Authors: Daniel Gustafsson, Magnus Hagander
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/monitoring.sgml                  |  206 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/heap/heapam.c              |    1 +
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  455 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |    7 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1404 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/buffer/bufmgr.c           |    4 +-
 src/backend/storage/ipc/ipci.c                |    3 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/storage/smgr/bulk_write.c         |    2 +
 src/backend/utils/activity/pgstat.c           |    1 -
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   10 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   85 +
 src/test/checksum/t/002_restarts.pl           |   90 ++
 src/test/checksum/t/003_standby_restarts.pl   |  130 ++
 src/test/checksum/t/004_offline.pl            |  100 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/tools/pgindent/typedefs.list              |    5 +
 55 files changed, 3001 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index d6acdd3059..dff3f1ab08 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29715,6 +29715,77 @@ DETAIL:  Make sure pg_wal_replay_wait() isn't called within a transaction with a
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 48ffe87241..5c61e8bbc1 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3487,8 +3487,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database. Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3498,8 +3498,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6661,6 +6661,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_datachecksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_datachecksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-datachecksums-view" xreflabel="pg_stat_progress_datachecksums">
+   <title><structname>pg_stat_progress_datachecksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a datachecksumsworker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the datachecksumsworker hasn't calculated
+        the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any progress made by
+   the online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 0ba0c930b7..5061372c27 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index da5e656a08..0a5516f45d 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8576,6 +8576,7 @@ log_heap_visible(Relation rel, Buffer heap_buffer, Buffer vm_buffer,
 	XLogRegisterBuffer(0, vm_buffer, 0);
 
 	flags = REGBUF_STANDARD;
+
 	if (!XLogHintBitIsNeeded())
 		flags |= REGBUF_NO_IMAGE;
 	XLogRegisterBuffer(1, heap_buffer, flags);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 363294d623..465153e5d1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 64304d77d3..ecded32414 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -646,6 +646,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -714,6 +724,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -827,9 +839,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -842,7 +855,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4539,9 +4554,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4575,13 +4588,349 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used while the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of spurious validation
+ * failures during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		UpdateControlFile();
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Wait for all backends to absorb the new state, ensuring that any
+		 * backend in the "on" state has changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
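+
+/*
+ * The absorb functions above are presumably dispatched from
+ * ProcessProcSignalBarrier() in procsignal.c (that wiring is part of the
+ * procsignal changes elsewhere in this patch), roughly like this abbreviated
+ * sketch:
+ *
+ *     case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+ *         processed = AbsorbChecksumsOnBarrier();
+ *         break;
+ */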
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking.  So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* GUC show hook for the data_checksums parameter */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6141,6 +6490,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums.  This is because we cannot launch a dynamic
+	 * background worker directly from here; it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that all backends had already stopped validating checksums, so
+	 * we can move directly to "off" instead of prompting the user to perform
+	 * any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8152,6 +8528,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8580,6 +8974,47 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		UpdateControlFile();
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 3e3d2bb618..c1b22e0b9c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -27,6 +27,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/proc.h"
@@ -803,3 +804,43 @@ pg_wal_replay_wait(PG_FUNCTION_ARGS)
 
 	PG_RETURN_VOID();
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost-based throttling to limit system load.  Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
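+
+/*
+ * The functions above are exposed at the SQL level as
+ * pg_enable_data_checksums() (see the system_functions.sql change further
+ * down) and, presumably via the pg_proc.dat changes elsewhere in this patch,
+ * pg_disable_data_checksums().  A typical invocation would be e.g.
+ * "SELECT pg_enable_data_checksums(cost_delay => 10);".
+ */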
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 14e5ba72e9..3c64413dd4 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1612,7 +1612,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2005,6 +2006,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index b0d0de051e..16c77db056 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,11 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0, cost_limit integer DEFAULT 100, fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -760,6 +765,8 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 49109dbdc8..e9d23d6826 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1322,6 +1322,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_datachecksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..662cd12681 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..72c75a092d
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1404 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or with pg_checksums on a
+ * cluster which is shut down, no extra process is required as each page is
+ * checksummed, and verified, when accessed.  When enabling checksums on an
+ * already running cluster, this worker will ensure that all pages are
+ * checksummed before verification of the checksums is turned on.  In the case
+ * of disabling checksums, the state transition is performed only in the
+ * control file; no changes are made to the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified.  This
+ * ensures that backends which have yet to move from the "on" state can still
+ * perform data checksum validation safely.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
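+ * Summarized as a (simplified) sketch of the data_checksums state machine,
+ * where each arrow is a WAL-logged controlfile update followed by a
+ * procsignalbarrier which all connected backends must absorb before the next
+ * transition may begin:
+ *
+ *     off -> inprogress-on  -> on    (enabling)
+ *     on  -> inprogress-off -> off   (disabling)
+ *
+ * An interrupted "inprogress" state can also transition directly to "off".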
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page already happens to match, we currently still dirty
+ *     the page.  It should be enough to only do the log_newpage_buffer() call
+ *     in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process is currently running?
+	 * Is a launcher process currently running?
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request.  So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS,
+								 numblocks);
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Progress report the current block */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS,
+									 blknum);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off.  This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  If wal_level is set to "minimal",
+		 * this could be avoided when the checksum already computes as correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort request will bubble up from here.  It's safe to check this
+		 * without a lock, because if we miss it being set, we will try again
+		 * soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_REL, relationId);
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed, we cannot end up with a processed database,
+	 * so we have no alternative other than exiting.  When enabling checksums
+	 * we won't at this point have changed the pg_control version to "on", so
+	 * when the cluster comes back up, processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to "off"
+	 * before this, so when the cluster comes up checksums will be off as
+	 * expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksum processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended.  If the postmaster dies while waiting, the process exits
+ * with FATAL since processing cannot be completed without the postmaster.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	/*
+	 * If we're asked to enable checksums, we need to check if processing was
+	 * previously interrupted such that we should resume rather than start
+	 * from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+									  InvalidOid);
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	ListCell   *lc;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first run processes the shared catalogs, rather than
+	 * processing them once for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need
+	 * to process.  This number should not change during processing; the
+	 * column for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach(lc, DatabaseList)
+		{
+			DataChecksumsWorkerDatabase *db = (DataChecksumsWorkerDatabase *) lfirst(lc);
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed and set the number
+			 * of relations to -1 to clear the field from previous values;
+			 * -1 will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to check for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass.  Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach(lc, DatabaseList)
+	{
+		DataChecksumsWorkerDatabase *db = (DataChecksumsWorkerDatabase *) lfirst(lc);
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failures where the database in question
+		 * still exists.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk.  The use of immediate
+	 * checkpoints is intended for running tests, which would otherwise be hard
+	 * to keep reliably within timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, we
+	 * want to be explicit about the initial state for readability, since this
+	 * state may be inspected when processing is restarted.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
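+
+/*
+ * Note: the sizing/init functions above are presumably registered with the
+ * shared memory machinery in ipci.c alongside the other subsystems (that
+ * wiring is elsewhere in the full patch), roughly:
+ *
+ *     size = add_size(size, DataChecksumsWorkerShmemSize());
+ *     ...
+ *     DataChecksumsWorkerShmemInit();
+ */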
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	ListCell   *lc;
+
+	if (!dblist)
+		return;
+
+	foreach(lc, dblist)
+	{
+		DataChecksumsWorkerDatabase *db = lfirst(lc);
+
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned, otherwise non-temporary
+ * relations which have storage (and thus need data checksums) are returned.
+ * If include_shared is true then shared relations are included as well in a
+ * non-temporary list; include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	ListCell   *lc;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS, dboid);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL,
+								 list_length(RelationList));
+	foreach(lc, RelationList)
+	{
+		Oid			reloid = lfirst_oid(lc);
+
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		ListCell   *lc;
+		int			numleft;
+		char		activity[64];
+		const int index[] = {
+			PROGRESS_DATACHECKSUMS_PHASE,
+			PROGRESS_DATACHECKSUMS_TOTAL_REL
+		};
+		int64 vals[2];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach(lc, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, lfirst_oid(lc)))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, so indicate this in
+		 * the pg_stat_activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+		vals[0] = PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL;
+		vals[1] = numleft;
+		pgstat_progress_update_multi_param(2, index, vals);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+
+	pgstat_progress_end_command();
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index d687ceee33..6c38a57a20 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4852044300..93e198f19e 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1516,7 +1516,9 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 				bufBlock = BufHdrGetBlock(bufHdr);
 			}
 
-			/* check for garbage data */
+			/*
+			 * Check for garbage data.
+			 */
 			if (!PageIsVerifiedExtended((Page) bufBlock, io_first_block + j,
 										PIV_LOG_WARNING | PIV_REPORT_STAT))
 			{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 10fc18f252..b9d98d0ada 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -31,6 +31,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
@@ -152,6 +153,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, WaitLSNShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -334,6 +336,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..e51cfb3a98 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -573,6 +574,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/storage/smgr/bulk_write.c b/src/backend/storage/smgr/bulk_write.c
index 1a5f3ce96e..1981ed768d 100644
--- a/src/backend/storage/smgr/bulk_write.c
+++ b/src/backend/storage/smgr/bulk_write.c
@@ -36,6 +36,7 @@
 
 #include "access/xloginsert.h"
 #include "access/xlogrecord.h"
+#include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/bufpage.h"
 #include "storage/bulk_write.h"
@@ -303,6 +304,7 @@ smgr_bulk_flush(BulkWriteState *bulkstate)
 		}
 		else
 			smgrwrite(bulkstate->smgr, bulkstate->forknum, blkno, page, true);
+
 		pfree(page);
 	}
 
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index d1768a89f6..b612d9d0fc 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -353,7 +353,6 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
 		.reset_timestamp_cb = pgstat_subscription_reset_timestamp_cb,
 	},
 
-
 	/* stats for fixed-numbered (mostly 1) objects */
 
 	[PGSTAT_KIND_ARCHIVER] = {
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa..2b5671e2ca 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -359,6 +359,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 8efb4044d6..4ed6ec157a 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for enabling of data checksums to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -347,6 +349,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 WaitLSN	"Waiting to read or update shared Wait-for-LSN state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 17b0fc02ef..bcc6da7996 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -246,6 +246,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1115,9 +1117,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1132,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 537d92c0cf..6571c187cc 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d..513b163ed7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -720,6 +720,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -864,7 +869,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 686309db58..e56e43f701 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -474,6 +474,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -599,7 +607,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1919,17 +1927,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5196,6 +5193,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f5f7ff1045..3bda3adb04 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a2..527c807f1c 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -693,6 +693,15 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums have been turned on in the old cluster, but the
+	 * datachecksumsworker has yet to finish, then disallow the upgrade. The
+	 * user should either let the process finish, or turn off checksums,
+	 * before retrying.
+	 */
+	if (oldctrl->data_checksum_version == 2)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 36f6e4e4b4..0e2b53c8d2 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -229,7 +229,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 5ef244bcdb..21edc8b737 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+}			xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index e80ff8e414..17da375aa6 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 05fcbf7515..c2703308eb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12140,6 +12140,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 5616d64523..8e3c180392 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -155,4 +155,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index e26d108a47..55e507683a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -357,6 +357,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -380,6 +383,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..59c9000d64
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6222d46e53..9ff4127e58 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -204,7 +204,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 88dc79b2bd..c3b3011f72 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -84,3 +84,4 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, WaitLSN)
+PG_LWLOCK(54, DataChecksumsWorker)
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index f94c11a9a8..bba00f9d19 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 7b63d38f97..063d3ef86c 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index dbd3192874..36023c1878 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,15 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation
+with multiple nodes (primary and standby) for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..f16cf78b91
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,85 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Change the underlying data before re-enabling checksums, to ensure that
+# the newly calculated checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..dea0ec31df
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,90 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this by setting up an interactive psql session which keeps its
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that processing remains in the in-progress state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..26ad93f86e
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,130 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..b1f585ec7c
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,100 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this by setting up an interactive psql session which keeps its
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that processing remains in the in-progress state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index c3d0dfedf1..b07d5d2d00 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -7,6 +7,7 @@ subdir('authentication')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 90a842f96a..c008459ae8 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3483,6 +3483,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5fabb127d7..929232bf5b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -396,6 +396,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -582,6 +583,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
-- 
2.39.3 (Apple Git-146)

#5Michael Banck
mbanck@gmx.net
In reply to: Daniel Gustafsson (#4)
Re: Changing the state of data checksums in a running cluster

Hi,

On Mon, Sep 30, 2024 at 11:21:30PM +0200, Daniel Gustafsson wrote:

Yeah, I think a view like pg_stat_progress_checksums would work.

Added in the attached version. It probably needs some polish (the docs for
sure do) but it's at least a start.

Just a nitpick, but we call it data_checksums about everywhere, but the
new view is called pg_stat_progress_datachecksums - I think
pg_stat_progress_data_checksums would look better even if it gets quite
long.

Michael

#6Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Banck (#5)
Re: Changing the state of data checksums in a running cluster

On 1 Oct 2024, at 00:43, Michael Banck <mbanck@gmx.net> wrote:

Hi,

On Mon, Sep 30, 2024 at 11:21:30PM +0200, Daniel Gustafsson wrote:

Yeah, I think a view like pg_stat_progress_checksums would work.

Added in the attached version. It probably needs some polish (the docs for
sure do) but it's at least a start.

Just a nitpick, but we call it data_checksums about everywhere, but the
new view is called pg_stat_progress_datachecksums - I think
pg_stat_progress_data_checksums would look better even if it gets quite
long.

That's a fair point, I'll make sure to switch for the next version of the
patch.
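
Once renamed, checking the progress from SQL would then look something like:

    SELECT * FROM pg_stat_progress_data_checksums;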

--
Daniel Gustafsson

#7Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#6)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 1 Oct 2024, at 20:55, Daniel Gustafsson <daniel@yesql.se> wrote:

On 1 Oct 2024, at 00:43, Michael Banck <mbanck@gmx.net> wrote:

Hi,

On Mon, Sep 30, 2024 at 11:21:30PM +0200, Daniel Gustafsson wrote:

Yeah, I think a view like pg_stat_progress_checksums would work.

Added in the attached version. It probably needs some polish (the docs for
sure do) but it's at least a start.

Just a nitpick, but we call it data_checksums about everywhere, but the
new view is called pg_stat_progress_datachecksums - I think
pg_stat_progress_data_checksums would look better even if it gets quite
long.

That's a fair point, I'll make sure to switch for the next version of the
patch.

A rebased v3 attached with that change.

--
Daniel Gustafsson

Attachments:

v3-0001-Support-checksum-enable-disable-in-a-running-clus.patch (application/octet-stream)
From e196816b490f5c6783d6e986688cc58a188c2622 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Tue, 2 Jul 2024 15:20:43 +0200
Subject: [PATCH v3] Support checksum enable/disable in a running cluster

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could previously only be enabled at initdb time, or
while the cluster is offline using pg_checksums. This commit introduces
functionality to enable, and disable, data checksums without the need
to turn off the cluster.

A dynamic background worker is responsible for launching a per-database
worker which will mark all buffers dirty for all relations with storage
in order for them to get data checksums written. Once all relations
in all databases have been processed, the data_checksums state can be
set to "on" and the cluster will at that point be identical to one
which had checksums enabled from the start.

While the cluster is writing checksums on existing buffers, checksums
are written but not verified during reading to avoid false negatives.
Disabling checksums will not touch any buffers (but existing checksums
cannot be re-used in case checksums are immediately re-enabled). While
disabling, checksums are still written but not verified, to ensure that
concurrent backends which haven't yet stopped verifying checksums won't
incur verification errors.

New in-progress states are introduced for data_checksums which during
processing ensure that backends know whether to verify and write
checksums. All state changes across backends are synchronized using
procsignal barriers.
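
As a usage sketch, with the argument values used in the tests (0 and 100
being the defaults for cost_delay and cost_limit):

    -- start background processing to enable data checksums
    SELECT pg_enable_data_checksums(0, 100, false);
    -- state goes from "inprogress-on" to "on" once all pages are checksummed
    SHOW data_checksums;
    -- disabling does not rewrite any pages
    SELECT pg_disable_data_checksums();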

Earlier versions of this patch were reviewed by Heikki Linnakangas,
Robert Haas, Andres Freund, Tomas Vondra, Michael Banck and Andrey
Borodin.

Authors: Daniel Gustafsson, Magnus Hagander
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/monitoring.sgml                  |  206 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/heap/heapam.c              |    1 +
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  455 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |    7 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1404 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/buffer/bufmgr.c           |    4 +-
 src/backend/storage/ipc/ipci.c                |    3 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/storage/smgr/bulk_write.c         |    2 +
 src/backend/utils/activity/pgstat.c           |    1 -
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   10 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   85 +
 src/test/checksum/t/002_restarts.pl           |   90 ++
 src/test/checksum/t/003_standby_restarts.pl   |  130 ++
 src/test/checksum/t/004_offline.pl            |  100 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/tools/pgindent/typedefs.list              |    5 +
 55 files changed, 3001 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7b4fbb5047..110f25324c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29715,6 +29715,77 @@ DETAIL:  Make sure pg_wal_replay_wait() isn't called within a transaction with a
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3..9ed5c64948 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3496,8 +3496,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database. Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3507,8 +3507,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6670,6 +6670,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a datachecksumsworker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the datachecksumsworker hasn't calculated
+        the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
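For monitoring, a sketch of a query against the view described above (column
names as defined in this patch; the percentage calculation is purely for
presentation):

  SELECT pid, datname, phase,
         databases_processed, databases_total,
         relations_processed, relations_total,
         round(100.0 * relation_current_blocks_processed /
               nullif(relation_current_blocks, 0), 1) AS current_rel_pct
  FROM pg_stat_progress_data_checksums;
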
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of how far the online
+   processing had progressed.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a34cddb5ed..448a9c0c94 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure
+    that <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will then
+    start the processing over from the beginning, as no progress state is
+    persisted across restarts.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
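To make the restart behaviour described above concrete, a sketch of what this
might look like after a shutdown interrupted the processing (output is
illustrative):

  SHOW data_checksums;                 -- still reports "inprogress-on"
  SELECT pg_enable_data_checksums();   -- relaunches the background worker,
                                       -- which starts over from the beginning
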
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index da5e656a08..0a5516f45d 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8576,6 +8576,7 @@ log_heap_visible(Relation rel, Buffer heap_buffer, Buffer vm_buffer,
 	XLogRegisterBuffer(0, vm_buffer, 0);
 
 	flags = REGBUF_STANDARD;
+
 	if (!XLogHintBitIsNeeded())
 		flags |= REGBUF_NO_IMAGE;
 	XLogRegisterBuffer(1, heap_buffer, flags);
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 363294d623..465153e5d1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9102c8d772..89078f2a77 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -646,6 +646,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile's data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -714,6 +724,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -827,9 +839,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -842,7 +855,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4539,9 +4554,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4575,13 +4588,349 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function must be called as close to the write operation
+ * as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  This function must be
+ * called as close to the validation call as possible to keep the critical
+ * section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		UpdateControlFile();
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	UpdateControlFile();
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6141,6 +6490,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here, it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8154,6 +8530,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8582,6 +8976,47 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		UpdateControlFile();
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 3e3d2bb618..c1b22e0b9c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -27,6 +27,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/proc.h"
@@ -803,3 +804,43 @@ pg_wal_replay_wait(PG_FUNCTION_ARGS)
 
 	PG_RETURN_VOID();
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
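
As a sketch of how the argument validation above surfaces at the SQL level
(via the pg_enable_data_checksums() wrapper defined in system_functions.sql
further down; error texts as in the code above, psql-style output):

  SELECT pg_enable_data_checksums(cost_delay => -1);
  -- ERROR:  cost delay cannot be a negative value

  SELECT pg_enable_data_checksums(cost_limit => 0);
  -- ERROR:  cost limit must be greater than zero
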
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 14e5ba72e9..3c64413dd4 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1612,7 +1612,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2005,6 +2006,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index b0d0de051e..16c77db056 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,11 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0, cost_limit integer DEFAULT 100, fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -760,6 +765,8 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3456b821bc..841d416fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1323,6 +1323,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..662cd12681 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..72c75a092d
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1404 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a database at initdb time or when shut down
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are made to the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This
+ * ensures that backends which have yet to move from the "on" state can still
+ * validate data checksums safely.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page already happens to match we still dirty the page.
+ *     It should be enough to only do the log_newpage_buffer() call in that
+ *     case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request.  So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS,
+								 numblocks);
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity), "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Progress report the current block */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS,
+									 blknum);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off.  This can happen if there was a valid checksum on the
+		 * page at some point in the past, i.e. only when checksums were first
+		 * on, then off, and then turned on again.  If wal_level is set to
+		 * "minimal", this could be avoided when the checksum is calculated to
+		 * already be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here.  It's safe to check this without a
+		 * lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_REL, relationId);
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
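+
+	/*
+	 * Ensure rel->rd_smgr is populated; RelationGetSmgr() is called here
+	 * only for that side effect, before probing which forks exist below.
+	 */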
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster died we cannot finish processing this database, so
+	 * we have no alternative other than exiting.  When enabling checksums we
+	 * won't have changed the pg_control version to "on" at this point, so
+	 * when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to "off"
+	 * before this, so when the cluster comes up checksums will be off as
+	 * expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksum processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that a
+ * new launcher can be started later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between blocks in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all transactions currently in progress have finished
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended.  If the postmaster dies while waiting, a FATAL error
+ * is raised since checksum processing cannot be completed without it.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases.  This function handles the bgworker management,
+ * while ProcessAllDatabases is responsible for looping over the databases
+ * and initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	/*
+	 * If we're asked to enable checksums, set the cluster to the
+	 * "inprogress-on" state and process all databases; otherwise simply
+	 * turn checksums off.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+									  InvalidOid);
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("could not enable data checksums in the cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums, looping around computing a new list and comparing it to the
+ * databases already seen until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	ListCell   *lc;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+	/*
+	 * Set things up so that the first run processes the shared catalogs,
+	 * rather than processing them once per database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need
+	 * to process.  This number should not change during processing; the
+	 * column for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach(lc, DatabaseList)
+		{
+			DataChecksumsWorkerDatabase *db = (DataChecksumsWorkerDatabase *) lfirst(lc);
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed and set the number
+			 * of relations to -1 to clear the field from previous values.
+			 * -1 will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to look for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass.  Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach(lc, DatabaseList)
+	{
+		DataChecksumsWorkerDatabase *db = (DataChecksumsWorkerDatabase *) lfirst(lc);
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failures for databases which still
+		 * exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in database \"%s\"",
+							db->dbname)));
+			found_failed = true;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("failed to enable data checksums in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk.  Immediate
+	 * checkpoints are intended for testing only, where a spread checkpoint
+	 * would make it hard to keep the tests within a reliable timeout.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, we
+	 * want to be explicit about the initial state for readability, and to
+	 * make it easy to query this state should restartability be added later.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	ListCell   *lc;
+
+	if (!dblist)
+		return;
+
+	foreach(lc, dblist)
+	{
+		DataChecksumsWorkerDatabase *db = lfirst(lc);
+
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types.  If temp_relations
+ * is true then only temporary relations are returned.  If temp_relations is
+ * false then non-temporary relations which have storage (and thus can carry
+ * data checksums) are returned.  If include_shared is true then shared
+ * relations are included as well in a non-temporary list; include_shared has
+ * no relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	ListCell   *lc;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS, dboid);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start.
+	 * We need to wait until they are all gone before we can finish, since we
+	 * cannot access these relations to modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL,
+								 list_length(RelationList));
+	foreach(lc, RelationList)
+	{
+		Oid			reloid = lfirst_oid(lc);
+
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		ListCell   *lc;
+		int			numleft;
+		char		activity[64];
+		const int index[] = {
+			PROGRESS_DATACHECKSUMS_PHASE,
+			PROGRESS_DATACHECKSUMS_TOTAL_REL
+		};
+		int64 vals[2];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach(lc, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, lfirst_oid(lc)))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in
+		 * pg_stat_activity and in progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+		vals[0] = PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL;
+		vals[1] = numleft;
+		pgstat_progress_update_multi_param(2, index, vals);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+
+	pgstat_progress_end_command();
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index d687ceee33..6c38a57a20 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4852044300..93e198f19e 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1516,7 +1516,9 @@ WaitReadBuffers(ReadBuffersOperation *operation)
 				bufBlock = BufHdrGetBlock(bufHdr);
 			}
 
-			/* check for garbage data */
+			/*
+			 * Check for garbage data.
+			 */
 			if (!PageIsVerifiedExtended((Page) bufBlock, io_first_block + j,
 										PIV_LOG_WARNING | PIV_REPORT_STAT))
 			{
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 10fc18f252..b9d98d0ada 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -31,6 +31,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
@@ -152,6 +153,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, WaitLSNShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -334,6 +336,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..e51cfb3a98 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -573,6 +574,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
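+
+As a minimal sketch (assuming the default parameter values), enabling data
+checksums at runtime and checking the resulting state could look like this:
+
+    SELECT pg_enable_data_checksums();
+    SHOW data_checksums;   -- reports "inprogress-on" while pages are being
+                           -- checksummed, and "on" once processing completes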
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/storage/smgr/bulk_write.c b/src/backend/storage/smgr/bulk_write.c
index 1a5f3ce96e..1981ed768d 100644
--- a/src/backend/storage/smgr/bulk_write.c
+++ b/src/backend/storage/smgr/bulk_write.c
@@ -36,6 +36,7 @@
 
 #include "access/xloginsert.h"
 #include "access/xlogrecord.h"
+#include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/bufpage.h"
 #include "storage/bulk_write.h"
@@ -303,6 +304,7 @@ smgr_bulk_flush(BulkWriteState *bulkstate)
 		}
 		else
 			smgrwrite(bulkstate->smgr, bulkstate->forknum, blkno, page, true);
+
 		pfree(page);
 	}
 
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index d1768a89f6..b612d9d0fc 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -353,7 +353,6 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
 		.reset_timestamp_cb = pgstat_subscription_reset_timestamp_cb,
 	},
 
-
 	/* stats for fixed-numbered (mostly 1) objects */
 
 	[PGSTAT_KIND_ARCHIVER] = {
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa..2b5671e2ca 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -359,6 +359,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 8efb4044d6..4ed6ec157a 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for temporary relations to be removed so that data checksums can be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for concurrent transactions to finish before enabling data checksums."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -347,6 +349,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 WaitLSN	"Waiting to read or update shared Wait-for-LSN state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f7b50e0b5a..9f52541eda 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -246,6 +246,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1115,9 +1117,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1132,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 537d92c0cf..6571c187cc 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d..513b163ed7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -720,6 +720,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -864,7 +869,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 686309db58..e56e43f701 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -474,6 +474,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -599,7 +607,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1919,17 +1927,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5196,6 +5193,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f5f7ff1045..3bda3adb04 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a2..527c807f1c 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -693,6 +693,15 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are being turned on in the old cluster, but the
+	 * datachecksumsworker has yet to finish, then disallow the upgrade. The
+	 * user should either let the process finish, or turn off checksums,
+	 * before retrying.
+	 */
+	if (oldctrl->data_checksum_version == 2)
+		pg_fatal("data checksums are currently being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 34ad46c067..dd7c75abe3 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 5ef244bcdb..21edc8b737 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+}			xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index e80ff8e414..17da375aa6 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 77f54a79e6..272ed8edc2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12145,6 +12145,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 5616d64523..8e3c180392 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -155,4 +155,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index e26d108a47..55e507683a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -357,6 +357,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -380,6 +383,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..59c9000d64
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+extern void StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+extern void DataChecksumsWorkerLauncherMain(Datum arg);
+extern void DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6222d46e53..9ff4127e58 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -204,7 +204,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 88dc79b2bd..c3b3011f72 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -84,3 +84,4 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, WaitLSN)
+PG_LWLOCK(54, DataChecksumsWorker)
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 221073def3..dbe68d7eed 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 7b63d38f97..063d3ef86c 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index dbd3192874..36023c1878 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,15 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check" this creates a temporary installation with
+multiple nodes, primary and standby, for the purpose of running the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..f16cf78b91
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,85 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three params: cost_delay, cost_limit and fast. For
+# testing we always want to override the default value for 'fast' with True
+# which will cause immediate checkpoints. 0 and 100 are the defaults for
+# cost_delay and cost_limit which are fine to use for testing so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..dea0ec31df
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,90 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three params: cost_delay, cost_limit and fast. For
+# testing we always want to override the default value for 'fast' with True
+# which will cause immediate checkpoints. 0 and 100 are the defaults for
+# cost_delay and cost_limit which are fine to use for testing so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..26ad93f86e
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,130 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three params: cost_delay, cost_limit and fast. For
+# testing we always want to override the default value for 'fast' with True
+# which will cause immediate checkpoints. 0 and 100 are the defaults for
+# cost_delay and cost_limit which are fine to use for testing so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary 2');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..b1f585ec7c
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,100 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three params: cost_delay, cost_limit and fast. For
+# testing we always want to override the default value for 'fast' with True
+# which will cause immediate checkpoints. 0 and 100 are the defaults for
+# cost_delay and cost_limit which are fine to use for testing so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index c3d0dfedf1..b07d5d2d00 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -7,6 +7,7 @@ subdir('authentication')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 30857f34bf..9e513611ca 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3481,6 +3481,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c4de597b1f..f8d2ab53fd 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -396,6 +396,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -582,6 +583,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
-- 
2.39.3 (Apple Git-146)

#8Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#7)
Re: Changing the state of data checksums in a running cluster

Hi,

I did a quick review today. First a couple minor comments:

1) monitoring.sgml

typos: number of database -> number of databases
calcuated -> calculated

2) unnecessary newline in heapam.c (no other changes)

3) unnecessary ListCell in DataChecksumsWorkerMain() on line 1345,
shadowing earlier variable

4) unnecessary comment change in bufmgr.c (no other changes)

5) unnecessary include and newline in bulk_write.c (no other changes)

6) unnecessary newline in pgstat.c (no other changes)

7) controldata.c - maybe this

if (oldctrl->data_checksum_version == 2)

should use PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION instead of the magic
constant? For "off" we use "0" which seems somewhat acceptable, but for
other values it's less obvious what the meaning is.

8) xlog_internal.h - xl_checksum_state should be added to typedefs

9) system_functions.sql - Isn't it weird that this only creates the new
pg_enable_data_checksums function, but not pg_disable_data_checksums? It
also means it doesn't revoke EXECUTE from public on it, which I guess it
probably should? Or why should this be different for the two functions?

Also the error message seems to differ:

test=> select pg_enable_data_checksums();
ERROR: permission denied for function pg_enable_data_checksums
test=> select pg_disable_data_checksums();
ERROR: must be superuser

Probably harmless, but seems a bit strange.

But there also seems to be a more serious problem with recovery. I did a
simple script that does a loop of

* start a cluster
* initialize a small pgbench database (scale 1 - 100)
* run "long" pgbench
* call pg_enable_data_checksums(), wait for it to complete
* stop the cluster with "-m immediate"
* start the cluster

And this unfortunately hits this assert:

bool
AbsorbChecksumsOnBarrier(void)
{
	Assert(LocalDataChecksumVersion ==
		   PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
	return true;
}

Based on our short discussion about this, the controlfile gets updated
right after pg_enable_data_checksums() completes. The immediate stop
however forces a recovery since the last checkpoint, which means we see
the XLOG_CHECKSUMS WAL message again, and set the barrier. And then we
exit recovery, try to start checkpointer and it trips over this, because
the control file already has the "on" value :-(

I'm not sure what's the best way to fix this. Would it be possible to
remember we saw the XLOG_CHECKSUMS during recovery, and make the assert
a no-op in that case? Or not set the barrier when exiting recovery. I'm not
sure the relaxed assert would remain meaningful, though. What would it
check on standbys, for example?
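
For concreteness, a rough sketch of the first idea; in_recovery_saw_checksums_on
is an invented flag, not something in the patch:

bool
AbsorbChecksumsOnBarrier(void)
{
	/*
	 * Hypothetical relaxation: tolerate already being at "on" when the
	 * XLOG_CHECKSUMS record was replayed during crash recovery and the
	 * control file had already been updated before the crash.
	 */
	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
		   (in_recovery_saw_checksums_on &&
			LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION));
	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
	return true;
}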

Maybe a better way would be to wait for a checkpoint before updating the
controlfile, similar to what we do at the beginning? Possibly even with
the same "fast=true/false" logic. That would prevent us from seeing the
XLOG_CHECKSUMS wal record with the updated flag. It would extend the
"window" where a crash would mean we have to redo the checksums, but I
don't think that matters much. For small databases who cares, and for
large databases it should not be a meaningful difference (setting the
checksums already ran over multiple checkpoints, so one checkpoint is
not a big difference).
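
A rough sketch of that ordering, reusing the existing RequestCheckpoint() API
(not actual patch code; "fast" here stands for the flag already passed to
pg_enable_data_checksums()):

	/* WAL-log the final state first ... */
	XLogChecksums(PG_DATA_CHECKSUM_VERSION);

	/*
	 * ... then wait for a checkpoint so the redo pointer moves past the
	 * XLOG_CHECKSUMS record and crash recovery no longer replays it.
	 */
	RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
					  (fast ? CHECKPOINT_IMMEDIATE : 0));

	/* Only now let the control file advertise "on". */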

regards

--
Tomas Vondra

In reply to: Tomas Vondra (#8)
Re: Changing the state of data checksums in a running cluster

Tomas Vondra <tomas@vondra.me> writes:

3) unnecessary ListCell in DataChecksumsWorkerMain() on line 1345,
shadowing earlier variable

All the ListCell variables can be eliminated by using the foreach_ptr
and foreach_oid macros instead of plain foreach.
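
For illustration, a minimal before/after sketch (DatabaseList and
ProcessDatabase() are made-up names):

	/* with plain foreach(), an explicit ListCell and lfirst_oid() are needed */
	ListCell   *lc;

	foreach(lc, DatabaseList)
	{
		Oid			dboid = lfirst_oid(lc);

		ProcessDatabase(dboid);
	}

	/* with foreach_oid(), the macro declares the loop variable itself */
	foreach_oid(dboid, DatabaseList)
		ProcessDatabase(dboid);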

- ilmari

#10Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#8)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 7 Oct 2024, at 16:46, Tomas Vondra <tomas@vondra.me> wrote:

I did a quick review today. First a couple minor comments:

Thanks for looking! 1-6 are all fixed.

7) controldata.c - maybe this

if (oldctrl->data_checksum_version == 2)

should use PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION instead of the magic
constant? For "off" we use "0" which seems somewhat acceptable, but for
other values it's less obvious what the meaning is.

It doesn't seem clean to include storage/bufpage.h in pg_upgrade; I wonder if
we should move (or mirror) the checksum versions to storage/checksum_impl.h to
make them available to both frontend and backend tools?
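
Mirroring just the three constants would be enough to let controldata.c spell
out the state (a sketch, assuming checksum_impl.h ends up being the home):

	/* storage/checksum_impl.h, also usable from frontend code */
	#define PG_DATA_CHECKSUM_VERSION				1
	#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
	#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3

	/* pg_upgrade/controldata.c */
	if (oldctrl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
	{
		/* same handling as before, just without the magic constant */
	}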

8) xlog_internal.h - xl_checksum_state should be added to typedefs

Fixed.

9) system_functions.sql - Isn't it weird that this only creates the new
pg_enable_data_checksums function, but not pg_disable_data_checksums?

We don't need any DEFAULT values for pg_disable_data_checksums so it doesn't
need to be created there.

It
also means it doesn't revoke EXECUTE from public on it, which I guess it
probably should? Or why should this be different for the two functions?

That should however be done, so fixed.

But there also seems to be a more serious problem with recovery. I did a
simple script that does a loop of

* start a cluster
* initialize a small pgbench database (scale 1 - 100)
* run "long" pgbench
* call pg_enable_data_checksums(), wait for it to complete
* stop the cluster with "-m immediate"
* start the cluster

And this unfortunately hits this assert:

bool
AbsorbChecksumsOnBarrier(void)
{
	Assert(LocalDataChecksumVersion ==
		   PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
	return true;
}

Based on our short discussion about this, the controlfile gets updated
right after pg_enable_data_checksums() completes. The immediate stop
however forces a recovery since the last checkpoint, which means we see
the XLOG_CHECKSUMS WAL message again, and set the barrier. And then we
exit recovery, try to start checkpointer and it trips over this, because
the control file already has the "on" value :-(

I'm not sure what's the best way to fix this. Would it be possible to
remember we saw the XLOG_CHECKSUMS during recovery, and make the assert
noop in that case? Or not set the barrier when exiting recovery. I'm not
sure the relaxed assert would remain meaningful, though. What would it
check on standbys, for example?

Maybe a better way would be to wait for a checkpoint before updating the
controlfile, similar to what we do at the beginning? Possibly even with
the same "fast=true/false" logic. That would prevent us from seeing the
XLOG_CHECKSUMS wal record with the updated flag. It would extend the
"window" where a crash would mean we have to redo the checksums, but I
don't think that matters much. For small databases who cares, and for
large databases it should not be a meaningful difference (setting the
checksums already ran over multiple checkpoints, so one checkpoint is
not a big difference).

The more I think about it, the more I think that updating the control file is
the wrong thing for this patch to do; it should only change the state in memory
and let checkpoints update the controlfile. The attached fixes that, and I
can no longer reproduce the assertion failure you hit.
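
Roughly, the final transition becomes the following (a sketch of the idea, not
the exact code in the attached patch):

	/* update the authoritative in-memory copy under ControlFileLock ... */
	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
	LWLockRelease(ControlFileLock);

	/*
	 * ... but don't call UpdateControlFile() here; the next checkpoint
	 * writes pg_control and picks up the new value.
	 */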

The attached version also contains updates to the documentation, the aux proc
counter and other smaller bits of polish.

I did remove parts of the progress reporting for now since it seems it can't be
used from a dynamic background worker. I need to regroup and figure out a
better way there, but I wanted to address your find above sooner rather than
wait for that.

--
Daniel Gustafsson

Attachments:

v4-0001-Online-enabling-and-disabling-of-data-checksums.patch
From 3493aebfb86ec2090f92e619b5bc778ef9e2cb38 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Tue, 2 Jul 2024 15:20:43 +0200
Subject: [PATCH v4] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduce functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

While the worker is still writing data checksums on the existing buffers,
data checksums are written but not verified to avoid false positives.
Disabling checksums will not touch any buffers (but existing checksums
cannot be re-used in case checksums are immediately re-enabled). While
disabling, checksums are again written but not verified to ensure that
concurrent backends which haven't yet stopped verifying checksums will
not incur a verification error.

New in-progress states are introduced for data_checksums which during
processing ensures that backends know whether to verify and write
checksums. All state changes across backends are synchronized using
procsignalbarriers.

This is based on an earlier version of this patch which was reviewed by
among others Heikki Linnakangas, Robert Haas, Andres Freund, Tomas Vondra,
Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  206 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1374 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    3 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   10 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   85 +
 src/test/checksum/t/002_restarts.pl           |   90 ++
 src/test/checksum/t/003_standby_restarts.pl   |  130 ++
 src/test/checksum/t/004_offline.pl            |  100 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/tools/pgindent/typedefs.list              |    6 +
 53 files changed, 2991 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7b4fbb5047..110f25324c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29715,6 +29715,77 @@ DETAIL:  Make sure pg_wal_replay_wait() isn't called within a transaction with a
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as
+        start a background worker process that will process all pages in the
+        cluster and enable checksums on them. When all data pages have had
+        checksums enabled, the cluster will automatically switch the data
+        checksums mode to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index f54f25c1c6..57a9188dc1 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm> process
+     for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3..aeb96aeceb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3496,8 +3496,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database. Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3507,8 +3507,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6670,6 +6670,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase; see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   If the cluster was shut down while checksums were being enabled online,
+   <application>pg_checksums</application> will still process all relations
+   when enabling checksums, regardless of any progress made by the online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a34cddb5ed..448a9c0c94 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 363294d623..465153e5d1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9102c8d772..e25b071c42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -646,6 +646,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -714,6 +724,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -827,9 +839,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -842,7 +855,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4539,9 +4554,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4575,13 +4588,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of
+ * being enabled or disabled. During the "inprogress-on" and "inprogress-off"
+ * states checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for call sites which are about to write a data
+ * page to storage and need to know whether to re-calculate the checksum for
+ * the page header.  It must be called as close to the write operation as
+ * possible to keep the critical section short.
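+ *
+ * Illustrative sketch (not a call site in this patch) of the intended
+ * pattern, holding off interrupts so that a procsignalbarrier cannot change
+ * the backend-local state between the check and the write:
+ *
+ *     HOLD_INTERRUPTS();
+ *     if (DataChecksumsNeedWrite())
+ *         ((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
+ *     ... hand the page to smgr ...
+ *     RESUME_INTERRUPTS();
+ *
+ * DataChecksumsNeedVerify() below is used the same way on the read side.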
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for call sites which have read a data page and
+ * are about to validate its checksum based on the result of this.  It must be
+ * called as close to the validation as possible to keep the critical section
+ * short, in order to protect against time-of-check/time-of-use situations
+ * around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used while the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description of how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of spurious validation
+ * failures during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that a checkpoint cannot occur while we are disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
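+ *
+ * For illustration only: these functions merely update the backend-local
+ * LocalDataChecksumVersion; the backend emitting the barrier is responsible
+ * for the WAL record and the controlfile update.  The dispatch from the
+ * barrier machinery (added elsewhere in this patch, not visible in this
+ * hunk) is expected to follow the usual ProcessProcSignalBarrier() pattern,
+ * roughly:
+ *
+ *     case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+ *         processed = AbsorbChecksumsOnBarrier();
+ *         break;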
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* GUC show hook for the data_checksums parameter */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6141,6 +6486,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with data checksums in the "inprogress-on"
+	 * state, we notify the user that they need to manually restart checksum
+	 * processing.  This is because we cannot launch a dynamic background
+	 * worker directly from here; it has to be launched from a regular
+	 * backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8154,6 +8526,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
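+ *
+ * The record is flushed before returning, so the new state is durable before
+ * the caller updates the control file and emits the corresponding
+ * procsignalbarrier.  The record is replayed in xlog_redo() below, where a
+ * standby performs the same controlfile update and barrier handling.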
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8582,6 +8972,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 3e3d2bb618..c1b22e0b9c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -27,6 +27,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/proc.h"
@@ -803,3 +804,43 @@ pg_wal_replay_wait(PG_FUNCTION_ARGS)
 
 	PG_RETURN_VOID();
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 14e5ba72e9..3c64413dd4 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1612,7 +1612,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2005,6 +2006,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index b0d0de051e..c4a6c9a7ab 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -760,6 +767,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM PUBLIC;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM PUBLIC;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3456b821bc..841d416fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1323,6 +1323,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+					  WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..662cd12681 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..b983ab3922
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1374 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a database at initdb time or when shut down
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are made to the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
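+ * Process overview
+ * ----------------
+ * As a summary of the flow implemented in this file:
+ * pg_enable_data_checksums()/pg_disable_data_checksums() call
+ * StartDataChecksumsWorkerLauncher(), which registers a dynamic background
+ * worker running DataChecksumsWorkerLauncherMain().  When enabling, the
+ * launcher spawns one DataChecksumsWorkerMain() worker per database (see
+ * ProcessDatabase()) and, once all databases have been processed, flips the
+ * cluster-wide state with SetDataChecksumsOn().  When disabling, no
+ * per-database workers are needed and the launcher calls
+ * SetDataChecksumsOff() directly.
+ *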
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to validate data checksums.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
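+ * State machine summary
+ * ---------------------
+ * Putting the transitions described above together (this is only a summary,
+ * the sections above and xlog.c are authoritative):
+ *
+ *    off --> inprogress-on  --> on     enabling data checksums
+ *    on  --> inprogress-off --> off    disabling data checksums
+ *    inprogress-on/-off     --> off    disabling while a transition is
+ *                                      already in progress
+ *
+ * Each arrow corresponds to a WAL-logged update of the controlfile state
+ * followed by a procsignalbarrier which is waited on before the next step
+ * is started.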
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when the checksum already matches: currently
+ *     the page is dirtied even if its checksum happens to be correct. It
+ *     should be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Request that the datachecksumsworker launcher be started
+ *
+ * The main entry point for starting data checksum processing, for enabling
+ * as well as disabling. This is called from a regular backend and registers
+ * a dynamic background worker which performs the actual work.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at latest
+	 * when it's about to exit, and will loop back to process the new request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. Iff wal_level is set to "minimal",
+		 * off, and then turned on again.  If wal_level is set to "minimal",
+		 * this could be avoided when the checksum is calculated to be correct.
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will be set to off before this,
+	 * so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksum processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again after the launcher has exited.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended.  If the postmaster dies while waiting, the process
+ * exits with FATAL since processing cannot be completed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+	/*
+	 * If we're asked to enable checksums, we need to check if processing was
+	 * previously interrupted such that we should resume rather than start
+	 * from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+	/*
+	 * Set up so that the first run processes the shared catalogs, rather than
+	 * processing them once per database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need
+	 * to process. This number should not be changed during processing, the
+	 * to process. This number is not changed during processing; the column
+	 * for processed databases is instead increased such that it can be
+	 * compared against the total.
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed and set the number of
+			 * relations to -1 to clear the field from previous values. -1 will
+			 * translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "process with restart" : "process completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResult *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still
+		 * exist.
+		 */
+		if (found && *entry == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, we
+	 * want to be explicit about our intent for readability, since we want to
+	 * be able to query this state in case of restartability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned.
+ * If include_shared is True then shared relations are included as well in a
+ * non-temporary list. include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we cannot
+	 * access these relations and modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index d687ceee33..6c38a57a20 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 10fc18f252..b9d98d0ada 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -31,6 +31,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
@@ -152,6 +153,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, WaitLSNShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -334,6 +336,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..e51cfb3a98 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -573,6 +574,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa..2b5671e2ca 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -359,6 +359,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 8efb4044d6..4ed6ec157a 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -347,6 +349,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 WaitLSN	"Waiting to read or update shared Wait-for-LSN state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f7b50e0b5a..9f52541eda 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -246,6 +246,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1115,9 +1117,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1132,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index ef60f41b8c..8d15c15edb 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d..513b163ed7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -720,6 +720,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -864,7 +869,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 686309db58..e56e43f701 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -474,6 +474,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -599,7 +607,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1919,17 +1927,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5196,6 +5193,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f5f7ff1045..3bda3adb04 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a2..527c807f1c 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -693,6 +693,15 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums have been turned on in the old cluster, but the
+	 * datachecksumsworker has yet to finish, then disallow the upgrade. The
+	 * user should either let the process finish, or turn off checksums,
+	 * before retrying.
+	 */
+	if (oldctrl->data_checksum_version == 2)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 34ad46c067..dd7c75abe3 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 5ef244bcdb..21edc8b737 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+}			xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index e80ff8e414..17da375aa6 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 77f54a79e6..272ed8edc2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12145,6 +12145,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 5616d64523..8e3c180392 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -155,4 +155,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index e26d108a47..55e507683a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -357,6 +357,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -380,6 +383,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..59c9000d64
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6222d46e53..9ff4127e58 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -204,7 +204,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION defines that data checksums are enabled in the
+ * cluster and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION defines that data
+ * checksums are either currently being enabled or disabled.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 88dc79b2bd..c3b3011f72 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -84,3 +84,4 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, WaitLSN)
+PG_LWLOCK(54, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index ebcf0ad403..902a9b51a9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -445,9 +445,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums worker and launcher
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 221073def3..dbe68d7eed 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 7b63d38f97..063d3ef86c 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index dbd3192874..36023c1878 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,15 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: This creates a temporary installation (in the case of "check")
+with multiple nodes (primary and standbys) for the purpose of
+the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..f16cf78b91
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,85 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Update the underlying data before re-enabling checksums, to ensure that the
+# newly computed checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..dea0ec31df
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,90 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..26ad93f86e
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,130 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary 2');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..b1f585ec7c
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,100 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine to use for testing,
+# so let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres', "SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index c3d0dfedf1..b07d5d2d00 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -7,6 +7,7 @@ subdir('authentication')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 30857f34bf..9e513611ca 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3481,6 +3481,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a65e1c07c5..aa2e41004c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -396,6 +396,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -582,6 +583,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4076,6 +4081,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#11Daniel Gustafsson
daniel@yesql.se
In reply to: Dagfinn Ilmari Mannsåker (#9)
Re: Changing the state of data checksums in a running cluster

On 7 Oct 2024, at 20:42, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:

Tomas Vondra <tomas@vondra.me> writes:

3) unnecessary ListCell in DataChecksumsWorkerMain() on line 1345,
shadowing earlier variable

All the ListCell variables can be eliminated by using the foreach_ptr
and foreach_oid macros instead of plain foreach.

Fair point, done in the v4 attached upthread.

--
Daniel Gustafsson

#12Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#10)
Re: Changing the state of data checksums in a running cluster

On 10/8/24 22:38, Daniel Gustafsson wrote:

7) controldata.c - maybe this

if (oldctrl->data_checksum_version == 2)

should use PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION instead of the magic
constant? For "off" we use "0" which seems somewhat acceptable, but for
other values it's less obvious what the meaning is.

It doesn't seem clean to include storage/bufpage.h in pg_upgrade, I wonder if
we should move (or mirror) the checksum versions to storage/checksum_impl.h to
make them available to frontend and backend tools?

+1 to have checksum_impl.h

But there also seems to be a more serious problem with recovery. I did a
simple script that loops over the following steps (a rough sketch follows the list):

* start a cluster
* initialize a small pgbench database (scale 1 - 100)
* run "long" pgbench
* call pg_enable_data_checksums(), wait for it to complete
* stop the cluster with "-m immediate"
* start the cluster

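A rough sketch of one iteration of such a loop, written here as a TAP test
rather than the exact script used above (the pgbench scale, run time, and the
cost_delay/cost_limit/fast arguments passed to pg_enable_data_checksums() are
illustrative assumptions, not taken from the thread):

use strict;
use warnings FATAL => 'all';
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;

my $node = PostgreSQL::Test::Cluster->new('crash_test');
$node->init;
$node->start;

# small pgbench database plus a short burst of write activity
$node->command_ok([ 'pgbench', '-i', '-s', '10', 'postgres' ], 'init pgbench');
$node->command_ok([ 'pgbench', '-T', '10', 'postgres' ], 'run pgbench');

# enable checksums and wait for the cluster-wide state to reach "on"
$node->safe_psql('postgres', 'SELECT pg_enable_data_checksums(0, 100, true);');
$node->poll_query_until('postgres',
	"SELECT setting FROM pg_settings WHERE name = 'data_checksums';", 'on');

# crash and restart; recovery replays the XLOG_CHECKSUMS record again, which
# is what tripped the assert shown below
$node->stop('immediate');
$node->start;
ok(1, 'cluster restarted after immediate shutdown with checksums enabled');

done_testing();
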
And this unfortunately hits this assert:

bool
AbsorbChecksumsOnBarrier(void)
{
Assert(LocalDataChecksumVersion ==
PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
return true;
}

Based on our short discussion about this, the controlfile gets updated
right after pg_enable_data_checksums() completes. The immediate stop
however forces a recovery since the last checkpoint, which means we see
the XLOG_CHECKSUMS WAL message again, and set the barrier. And then we
exit recovery, try to start checkpointer and it trips over this, because
the control file already has the "on" value :-(

I'm not sure what the best way to fix this is. Would it be possible to
remember that we saw the XLOG_CHECKSUMS record during recovery, and make the
assert a no-op in that case? Or not set the barrier when exiting recovery?
I'm not sure the relaxed assert would remain meaningful, though. What would
it check on standbys, for example?
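
Just to illustrate what relaxing it could mean, something along these lines
(a sketch only, tolerating a local state that was already initialized to "on"
from the control file; per the above it's not clear this stays meaningful):

bool
AbsorbChecksumsOnBarrier(void)
{
	/*
	 * Sketch of a relaxed assertion: also accept that the local state is
	 * already "on", e.g. when it was read from a control file that had been
	 * updated just before a crash.
	 */
	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
		   LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
	return true;
}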

Maybe a better way would be to wait for a checkpoint before updating the
controlfile, similar to what we do at the beginning? Possibly even with
the same "fast=true/false" logic. That would prevent us from seeing the
XLOG_CHECKSUMS wal record with the updated flag. It would extend the
"window" where a crash would mean we have to redo the checksums, but I
don't think that matters much. For small databases who cares, and for
large databases it should not be a meaningful difference (setting the
checksums already ran over multiple checkpoints, so one checkpoint is
not a big difference).

The more I think about it, the more I think that updating the control file is
the wrong thing to do for this patch; it should only change the state in memory
and let the checkpoints update the controlfile. The attached fixes that, and I
can no longer reproduce the assertion failure you hit.

I think leaving the update of controlfile to checkpointer is correct,
and probably the only way to make this correct (without race
conditions). We need to do that automatically with the checkpoint (which
updates the redo LSN, guaranteeing we won't see the XLOG_CHECKSUMS
record again).

I ran the tests with this new patch, and I haven't reproduced the
crashes. I'll let it run a bit longer, and improve it to test some more
stuff, but it looks good.

regards

--
Tomas Vondra

#13Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#12)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 9 Oct 2024, at 12:41, Tomas Vondra <tomas@vondra.me> wrote:

should use PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION instead of the magic
constant? For "off" we use "0" which seems somewhat acceptable, but for
other values it's less obvious what the meaning is.

It doesn't seem clean to include storage/bufpage.h in pg_upgrade, I wonder if
we should move (or mirror) the checksum versions to storage/checksum_impl.h to
make them available to frontend and backend tools?

+1 to have checksum_impl.h

I tried various different ways of breaking out the checksum version into
another header file but all of them ended up messier than the current state due
to how various tools include the checksum code. In the end I opted for doing
the bufpage include to keep it simple. This patch is big enough as it is
without additional refactoring of checksum (header) code, that can be done
separately from this.

I ran the tests with this new patch, and I haven't reproduced the
crashes. I'll let it run a bit longer, and improve it to test some more
stuff, but it looks good.

Thanks for testing, I too am unable to reproduce that error.

The attached v5 has the above include fix as well as pgindent and pgperltidy
runs and some tweaking to the commit message to make it concise. It's also
rebased to handle a recent conflict in the makefiles.

--
Daniel Gustafsson

Attachments:

v5-0001-Online-enabling-and-disabling-of-data-checksums.patchapplication/octet-stream; name=v5-0001-Online-enabling-and-disabling-of-data-checksums.patch; x-unix-mode=0644Download
From a1123d73050ad4084b9afa3dd5f449d2d70b3554 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 11 Oct 2024 09:26:13 +0200
Subject: [PATCH v5] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled during initdb or
while the cluster is offline using the pg_checksums application. This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  206 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    3 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/tools/pgindent/typedefs.list              |    6 +
 53 files changed, 3001 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b26db3b04b..49653609bf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29715,6 +29715,77 @@ DETAIL:  Make sure pg_wal_replay_wait() isn't called within a transaction with a
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksum mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in the cluster
+        and enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch the data checksum mode
+        to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index f54f25c1c6..57a9188dc1 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data
+     checksums worker</glossterm> process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3..aeb96aeceb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3496,8 +3496,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database. Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3507,8 +3507,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6670,6 +6670,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled or disabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently performing the data
+   checksum operation.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase; see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   If checksums were in the process of being enabled online when the cluster
+   was shut down, <application>pg_checksums</application> will still process
+   all relations, regardless of how far the online processing had progressed.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a34cddb5ed..448a9c0c94 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster, allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster into
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to be removed before it finishes.
+    If long-lived temporary tables are used in the application, it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will then
+    restart the processing from the beginning, as progress is not persisted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 363294d623..465153e5d1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9102c8d772..e25b071c42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -646,6 +646,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile's data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -714,6 +724,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -827,9 +839,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -842,7 +855,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4539,9 +4554,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4575,13 +4588,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function should be called as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  This function should be
+ * called as close to the validation call as possible to keep the critical
+ * section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description of how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6141,6 +6486,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here; it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8154,6 +8526,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8582,6 +8972,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 3e3d2bb618..c1b22e0b9c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -27,6 +27,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/proc.h"
@@ -803,3 +804,43 @@ pg_wal_replay_wait(PG_FUNCTION_ARGS)
 
 	PG_RETURN_VOID();
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 0f8cddcbee..1f051d9843 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1614,7 +1614,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index b0d0de051e..c4a6c9a7ab 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -639,6 +639,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -760,6 +767,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3456b821bc..841d416fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1323,6 +1323,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..662cd12681 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..de7a077f9c
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a database at initdb time or when shut down
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are made to the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state can still validate
+ * data checksums safely.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until they too have
+ *   absorbed the barrier and are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version.  These ideas are listed without any validation of their
+ * feasibility or potential payoff.  More discussion on these can be found in
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart, since a dynamic background worker cannot be started from the
+ *     postmaster.  Having the startup process launch it could make resuming
+ *     the processing automatic on cluster restart.
+ *   * Avoid dirtying the page when the checksum already matches: currently we
+ *     dirty the page even if the existing checksum happens to be correct.  It
+ *     should be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip already checksummed pages when it is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry to open a database before giving up and consider
+ * it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Launch the datachecksumsworker launcher process
+ *
+ * The entry point for starting data checksum processing, both when enabling
+ * and when disabling data checksums.
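+ *
+ * For illustration, the user-facing SQL entry points added by this patch can
+ * be invoked as follows (the cost values are arbitrary example numbers):
+ *
+ *   SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200,
+ *                                   fast => false);
+ *   SELECT pg_disable_data_checksums();
+ *
+ * The resulting cluster-wide state can be inspected with SHOW data_checksums.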
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at latest
+	 * when it's about to exit, and will loop back process the new request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off.  This can only happen if the page had a valid checksum at
+		 * some point in the past, i.e. when checksums were first on, then
+		 * off, and then turned on again.  If wal_level is set to "minimal",
+		 * this could be avoided when the existing checksum is found to be
+		 * correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check whether we have been asked to
+		 * abort; the abort will bubble up from here.  It's safe to check this
+		 * without a lock, because if we miss it being set, we will try again
+		 * soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster died we cannot end up with a processed database, so
+	 * we have no alternative but to exit.  When enabling checksums, the
+	 * pg_control state will not yet have been changed to "on", so when the
+	 * cluster comes back up processing will have to be restarted.  When
+	 * disabling, the pg_control state is set to "off" before this point, so
+	 * when the cluster comes back up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksum processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits.  We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again after the launcher has exited or aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork.  The process does, however,
+	 * sleep when waiting for concurrent transactions to end, so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all currently running transactions have finished
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended.  If the postmaster dies while waiting, the process
+ * exits with FATAL since processing cannot be completed.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, check the current state first,
+	 * since checksums may already be fully enabled and there would then be
+	 * nothing left to do.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
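+
+/*
+ * Progress for data checksum processing is reported under the
+ * PROGRESS_COMMAND_DATACHECKSUMS command type.  An illustrative way of
+ * following it from SQL through the generic progress-info function (a
+ * dedicated system view may be provided elsewhere in this patch):
+ *
+ *   SELECT * FROM pg_stat_get_progress_info('DATACHECKSUMS');
+ */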
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also covers the shared
+	 * catalogs, rather than processing them once in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number should not be changed during processing; the
+	 * column for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed, and set the number
+			 * of relations to -1 to clear the field from previous values; -1 will
+			 * translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to look for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass.  Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = true;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("could not enable data checksums in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk.  Immediate
+	 * checkpoints are mainly intended for tests, which would otherwise be
+	 * hard to keep reliably within timeout limits.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, we
+	 * want to be explicit about the initial state for readability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types.  If temp_relations
+ * is true then only temporary relations are returned.  If temp_relations is
+ * false then non-temporary relations which have storage, and thus need data
+ * checksums, are returned.  If include_shared is true then shared relations
+ * are included as well in a non-temporary list; include_shared has no
+ * relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start.  We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in
+		 * pg_stat_activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index d687ceee33..6c38a57a20 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 10fc18f252..b9d98d0ada 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -31,6 +31,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
@@ -152,6 +153,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, WaitLSNShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -334,6 +336,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..e51cfb3a98 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -573,6 +574,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa..2b5671e2ca 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -359,6 +359,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 8efb4044d6..4ed6ec157a 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -347,6 +349,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 WaitLSN	"Waiting to read or update shared Wait-for-LSN state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f7b50e0b5a..9f52541eda 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -246,6 +246,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1115,9 +1117,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1132,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index ef60f41b8c..8d15c15edb 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d..513b163ed7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -720,6 +720,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -864,7 +869,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 686309db58..3784db9cfd 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -474,6 +474,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -599,7 +607,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1919,17 +1927,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5196,6 +5193,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f5f7ff1045..3bda3adb04 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a2..7118f9069c 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -13,6 +13,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -693,6 +694,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are in an in-progress state in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 34ad46c067..dd7c75abe3 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 5ef244bcdb..613015e44a 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index e80ff8e414..17da375aa6 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3ae31a614c..c14dccfa6b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12158,6 +12158,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 5616d64523..8e3c180392 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -155,4 +155,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index e26d108a47..55e507683a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -357,6 +357,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -380,6 +383,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..59c9000d64
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6222d46e53..9ff4127e58 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -204,7 +204,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 means that data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 88dc79b2bd..c3b3011f72 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -84,3 +84,4 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, WaitLSN)
+PG_LWLOCK(54, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index ebcf0ad403..902a9b51a9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -445,9 +445,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums worker and launcher
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 221073def3..dbe68d7eed 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 7b63d38f97..063d3ef86c 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index abdd6e5a98..3b0d933d97 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation
+with multiple nodes (primary and standby(s)) for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..31777b2831
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Change the underlying data and re-enable checksums, ensuring that the newly
+# calculated checksums will be different from the previous ones.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..45239c9fa2
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We
+# accomplish this by setting up an interactive psql session which keeps the
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksum enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..2188270bd3
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..f6c3bbefc7
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init();
+$node->start();
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We
+# accomplish this by setting up an interactive psql session which keeps the
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksum enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index 67376e4b7f..2ce656a82a 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index c793f2135d..a2a8bfe3b0 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3558,6 +3558,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+	my $pgdata = $self->data_dir;
+
+	print "### Enabling checksums in \"$pgdata\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $pgdata, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+	my $pgdata = $self->data_dir;
+
+	print "### Disabling checksums in \"$pgdata\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $pgdata, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a65e1c07c5..aa2e41004c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -396,6 +396,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -582,6 +583,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4076,6 +4081,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#14Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#13)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

Attached is a rebased v6 which fixes the tests to handle that checksums are now
on by default; no other changes are made since there are no outstanding review
comments or identified bugs.

Does anyone object to going ahead with this?

--
Daniel Gustafsson

Attachments:

v6-0001-Online-enabling-and-disabling-of-data-checksums.patch (application/octet-stream)
From ec067e9747be0f146d05d37cb64f09416568f21b Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 11 Oct 2024 09:26:13 +0200
Subject: [PATCH v6] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled during initdb or
while the cluster is offline using the pg_checksums application.
This commit introduces functionality to enable or disable data
checksums while the cluster is running, regardless of how it was
initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  206 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/tools/pgindent/typedefs.list              |    6 +
 53 files changed, 3002 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 223d869f8c..1f1ec96bd0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29776,6 +29776,77 @@ DETAIL:  Make sure pg_wal_replay_wait() isn't called within a transaction with a
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in all databases
+        and enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch the data checksums mode
+        to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
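+   <para>
+    As an illustrative example, using the default throttling settings, data
+    checksums could be enabled and later disabled again with:
+<programlisting>
+SELECT pg_enable_data_checksums();
+SELECT pg_disable_data_checksums();
+</programlisting>
+   </para>
+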
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index f54f25c1c6..57a9188dc1 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3..aeb96aeceb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3496,8 +3496,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database. Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3507,8 +3507,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6670,6 +6670,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled or disabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently processing the data
+   checksum operation.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase; see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
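+
+  <para>
+   As an example, the progress of an ongoing data checksum operation could be
+   inspected with a query such as:
+<programlisting>
+SELECT phase, databases_processed, databases_total,
+       relations_processed, relations_total
+FROM pg_stat_progress_data_checksums;
+</programlisting>
+  </para>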
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   online when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations regardless of how far the online
+   processing had progressed.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a34cddb5ed..448a9c0c94 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time, either as an offline
+   operation or online in a running cluster while allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will switch the cluster checksum mode to
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two additional
+    processes.
+   </para>
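+
+   <para>
+    For example, enabling data checksums with the default settings and then
+    observing the current state could look like the following; the
+    <xref linkend="guc-data-checksums"/> setting will report
+    <literal>inprogress-on</literal> until the worker has finished, and
+    <literal>on</literal> thereafter:
+<programlisting>
+SELECT pg_enable_data_checksums();
+SHOW data_checksums;
+</programlisting>
+   </para>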
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If the application uses long-lived temporary tables, it may be necessary
+    to terminate those application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will then
+    start the processing over from the beginning.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 363294d623..465153e5d1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f14d3933ae..aa444c9216 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -648,6 +648,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking when interrogating the checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -716,6 +726,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -829,9 +841,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -844,7 +857,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4547,9 +4562,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4583,13 +4596,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage and need to know whether to re-calculate the checksum in the
+ * page header. It must be called as close to the write operation as possible
+ * to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read a data page and are
+ * about to perform checksum validation based on the result.  It must be called
+ * as close to the validation as possible to keep the critical section short,
+ * in order to protect against time-of-check/time-of-use situations around
+ * data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used while the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used while the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await the state transition to "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
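
These absorb functions are presumably dispatched from ProcessProcSignalBarrier() in src/backend/storage/ipc/procsignal.c (a hunk not shown here). A hedged sketch of what that dispatch could look like, assuming the PROCSIGNAL_BARRIER_CHECKSUM_* values are added to ProcSignalBarrierType by this patch:

#include "postgres.h"

#include "access/xlog.h"			/* AbsorbChecksums*Barrier(), assumed */
#include "storage/procsignal.h"		/* ProcSignalBarrierType */

/* illustrative only; the real dispatch lives in ProcessProcSignalBarrier() */
static bool
absorb_checksum_barrier_sketch(ProcSignalBarrierType type)
{
	switch (type)
	{
		case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
			return AbsorbChecksumsOnInProgressBarrier();
		case PROCSIGNAL_BARRIER_CHECKSUM_ON:
			return AbsorbChecksumsOnBarrier();
		case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
			return AbsorbChecksumsOffInProgressBarrier();
		case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
			return AbsorbChecksumsOffBarrier();
		default:
			return true;		/* other barrier types handled elsewhere */
	}
}
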
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* GUC show hook for the data_checksums parameter */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6149,6 +6494,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point while checksums were being enabled (the
+	 * "inprogress-on" state), we notify the user that they need to manually
+	 * restart the process to enable checksums. This is because we cannot
+	 * launch a dynamic background worker directly from here; it has to be
+	 * launched from a regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8162,6 +8534,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
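
For reference, the xl_checksum_state payload logged above carries nothing but the new version. Its declaration lives in a header hunk not shown here; assumed to be along these lines:

/* assumed declarations, defined elsewhere in the patch */
typedef uint32 ChecksumType;

typedef struct xl_checksum_state
{
	ChecksumType new_checksumtype;
} xl_checksum_state;
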
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8590,6 +8980,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index bca1d39568..33eab41754 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -27,6 +27,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/proc.h"
@@ -862,3 +863,43 @@ pg_wal_replay_wait_status(PG_FUNCTION_ARGS)
 
 	PG_RETURN_TEXT_P(cstring_to_text(result_string));
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index e2ed9081d1..122ec9c3b1 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 20d3b9b73f..ee7f8abc94 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -673,6 +673,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE
 AS 'pg_set_attribute_stats';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+                           fast boolean DEFAULT false)
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -796,6 +803,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM PUBLIC;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM PUBLIC;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3456b821bc..841d416fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1323,6 +1323,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..662cd12681 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..de7a077f9c
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or with pg_checksums on a
+ * shut-down cluster, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state can still validate
+ * data checksums without errors.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page happens to already match we still dirty the page.
+ *     It should be enough to only do the log_newpage_buffer() call in that
+ *     case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry to open a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Start the background worker process for changing the data checksum state
+ *
+ * The main entry point for initiating data checksum processing, for enabling
+ * as well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request.  So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  If wal_level is set to "minimal",
+		 * this could be avoided iff the checksum is calculated to be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort request will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database so
+	 * we have no alternative other than exiting. When enabling checksums we
+	 * won't at this time have changed the pg_control version to enabled so
+	 * when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will be set to off before this
+	 * so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksum processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clean up the abort flag to ensure that processing can be restarted
+ * again after it was previously aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which are active at the call of the function
+ * have ended, or if the postmaster dies while waiting. If the postmaster dies
+ * the abort flag will be set to indicate that the caller of this shouldn't
+ * proceed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, we need to check if processing was
+	 * previously interrupted such that we should resume rather than start
+	 * from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This repeatedly generates a list of databases to process when enabling
+ * checksums, looping until no new databases are found: each new list is
+ * compared against the databases already processed.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also covers the shared
+	 * catalogs; they do not need to be reprocessed for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not be changed during processing, the
+	 * columns for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed, and set the number of
+			 * relations to -1 to clear the field from previous values. -1 will
+			 * translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to look for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = true;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. Immediate checkpoints
+	 * are mainly intended for tests, which otherwise cannot reliably be
+	 * placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, be
+	 * explicit about the intended initial state for readability, since this
+	 * state may be queried when processing is restarted.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list; include_shared has no relevance when building a list
+ * of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we can finish, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in the
+		 * pgstat activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index e73576ad12..86f2ba1e2a 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index d68aa29d93..e78b062ba7 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -31,6 +31,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, WaitLSNShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..e51cfb3a98 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -573,6 +574,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa..2b5671e2ca 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -359,6 +359,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 8efb4044d6..4ed6ec157a 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksum enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -347,6 +349,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 WaitLSN	"Waiting to read or update shared Wait-for-LSN state."
+DataChecksumsWorker	"Waiting to read or update the state of the data checksums worker."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f7b50e0b5a..9f52541eda 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -246,6 +246,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1115,9 +1117,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1132,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index ef60f41b8c..8d15c15edb 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d..513b163ed7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -720,6 +720,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -864,7 +869,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..3ab7ead97d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -474,6 +474,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -599,7 +607,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1919,17 +1927,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5207,6 +5204,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f5f7ff1045..3bda3adb04 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a2..7118f9069c 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -13,6 +13,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -693,6 +694,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 34ad46c067..dd7c75abe3 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index d9cf51a0f9..80a354cb74 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index e80ff8e414..17da375aa6 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a38e20f5d9..2eda355fd6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12166,6 +12166,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 5616d64523..8e3c180392 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -155,4 +155,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index e26d108a47..55e507683a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -357,6 +357,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -380,6 +383,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..59c9000d64
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6222d46e53..9ff4127e58 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -204,7 +204,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 88dc79b2bd..c3b3011f72 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -84,3 +84,4 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, WaitLSN)
+PG_LWLOCK(54, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index d119465fa0..5038b24cfd 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -449,9 +449,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  Enabling or disabling data checksums
+ * can consume two more slots for the checksums launcher and worker.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 221073def3..dbe68d7eed 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 7b63d38f97..063d3ef86c 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index abdd6e5a98..3b0d933d97 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling and disabling data
+checksums in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation
+with multiple nodes (primary and standbys) for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..4c64f6a14f
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..2697b72225
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that enabling is held in the in-progress state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..6c0fe8f3bf
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary 2');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..9cee62c9b5
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that enabling is held in the in-progress state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index 67376e4b7f..2ce656a82a 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 007571e948..72c6d066bd 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3566,6 +3566,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 171a7dd5d2..08928993c4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -396,6 +396,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -582,6 +583,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4082,6 +4087,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#15Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#14)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 4 Nov 2024, at 12:27, Daniel Gustafsson <daniel@yesql.se> wrote:

Attached is a rebased v6 fixing the tests to handle that checksums are now on
by default; no other changes are made, as there are no outstanding review
comments or identified bugs.

Does anyone object to going ahead with this?

And a new rebase to cope with recent changes,

--
Daniel Gustafsson

Attachments:

v7-0001-Online-enabling-and-disabling-of-data-checksums.patch (application/octet-stream)
From d8e87c7bb4b945c1d91875b7a23da576ba19248d Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 11 Oct 2024 09:26:13 +0200
Subject: [PATCH v7] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could previously only be enabled during initdb or
while the cluster is offline using the pg_checksums tool. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.
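
A minimal usage sketch from SQL (illustrative only; 0 and 100 mirror the
cost_delay and cost_limit defaults used in the regression tests, and the
third argument controls whether the final checkpoint is immediate):

    SELECT pg_enable_data_checksums(0, 100, false);
    SELECT setting FROM pg_settings WHERE name = 'data_checksums';
        -- "inprogress-on" while relations are processed, "on" once done
    SELECT count(*) FROM pg_stat_activity
      WHERE backend_type LIKE 'datachecksumsworker%';
        -- launcher/worker processes still running
    SELECT pg_disable_data_checksums();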

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  206 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/tools/pgindent/typedefs.list              |    6 +
 53 files changed, 3002 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 73979f20ff..bd621e924c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29606,6 +29606,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
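A hedged usage sketch for the functions documented in the hunk above; the
throttling values are arbitrary examples and the parameter names follow the
definitions in system_functions.sql later in this patch:

    -- enable with vacuum-style throttling
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
    -- disable again; no arguments are taken
    SELECT pg_disable_data_checksums();
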
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index f54f25c1c6..57a9188dc1 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3..aeb96aeceb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3496,8 +3496,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database (or on a shared object). Detected failures are reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3507,8 +3507,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). The last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6670,6 +6670,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase; see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
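
A sketch of how the progress view documented above might be queried while a
data checksums worker is running (column names as defined in system_views.sql
later in this patch):

    SELECT pid, datname, phase,
           databases_processed, databases_total,
           relation_current_blocks_processed, relation_current_blocks
    FROM pg_stat_progress_data_checksums;
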
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   online when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any partial online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a34cddb5ed..448a9c0c94 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in the
+    <literal>inprogress-on</literal> state.  During this time, checksums will be
+    written but not verified. In addition, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing uses two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
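
A pre-flight sketch for the worker-slot requirement described in the hunk
above; the value 16 is only an example, and changing max_worker_processes
requires a server restart to take effect:

    SHOW max_worker_processes;
    -- the online processing needs two additional background worker slots
    -- (the launcher plus one per-database worker)
    ALTER SYSTEM SET max_worker_processes = 16;
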
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 363294d623..465153e5d1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6f58412bca..3d676f0f62 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +725,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +840,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or checksums need to be written (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +856,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4546,9 +4561,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4582,13 +4595,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage and need to know whether to re-calculate the checksum for the
+ * page header. This function must be called as close to the write operation
+ * as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  This function
+ * must be called as close to the validation call as possible to keep the
+ * critical section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used while the checksums
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used while the
+ * worker resets the controlfile state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description of how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await the state transition to "on" in all backends. When done we know
+	 * that data checksums are enabled in all backends and are both written
+	 * and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint while disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6148,6 +6493,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here, it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8155,6 +8527,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8583,6 +8973,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
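
After a crash or restart in the "inprogress-on" state, the startup handling
above only emits a warning; a sketch of the manual follow-up, with the WARNING
text taken from the ereport in this hunk:

    -- server log after restart:
    --   WARNING:  data checksums are being enabled, but no worker is running
    SELECT pg_enable_data_checksums();   -- restarts processing from the beginning
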
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index b0c6d7c687..bf6f5a8347 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
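
A sketch of the SQL-level behaviour implied by the input validation above,
assuming a superuser session (error wording taken from the ereport calls in
this hunk):

    SELECT pg_enable_data_checksums(cost_limit => 0);
    -- ERROR:  cost limit must be greater than zero
    SELECT pg_enable_data_checksums(cost_delay => -1);
    -- ERROR:  cost delay cannot be a negative value
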
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index e2ed9081d1..122ec9c3b1 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index c51dfca802..2301a338fb 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -668,6 +668,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE
 AS 'pg_set_attribute_stats';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -791,6 +798,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3456b821bc..841d416fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1323,6 +1323,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..662cd12681 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..de7a077f9c
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a cluster at initdb time, or offline with
+ * pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off",
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state can still validate
+ * data checksums without errors.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: If the checksum
+ *     on the page happens to already match, we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
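	/*
	 * If the namespace can no longer be resolved there is nothing to process
	 * here; treat it the same way as an aborted relation and let the caller
	 * deal with it.
	 */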
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pg_stat_activity */
+	snprintf(activity, sizeof(activity), "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. If wal_level is set to "minimal",
+		 * this could be avoided when the existing checksum is already
+		 * correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+		{
+			pfree(relns);
+			return false;
+		}
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(RelationGetSmgr(rel), fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
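	/*
	 * Initialize the result to failure so that a worker which exits without
	 * reporting a status is treated as having failed.
	 */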
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster died we cannot finish processing this database, so
+	 * we have no alternative other than exiting.  When enabling checksums we
+	 * won't at this point have set the data checksum version in pg_control
+	 * to enabled, so when the cluster comes back up processing will have to
+	 * be restarted.  When disabling, the pg_control version will have been
+	 * set to off before this, so when the cluster comes up checksums will be
+	 * off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity),
+			 "Waiting for worker in database %s (PID %d)", db->dbname, (int) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again after the launcher has exited.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all currently running transactions have finished
+ *
+ * Returns when all transactions which were active at the call of the function
+ * have ended. If the postmaster dies while waiting, the process exits with
+ * FATAL and processing will have to be restarted.
+ *
+ * NB: this will return early if aborted by SIGINT, or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
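	/*
	 * Any transaction that was already running (and had an xid assigned)
	 * when we read nextXid precedes waitforxid, so loop until the oldest
	 * active xid has advanced past it.
	 */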
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide, so exit; processing will have to be restarted once
+		 * the cluster is back up.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management,
+ * while ProcessAllDatabases is responsible for looping over the databases
+ * and initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, we need to check if processing was
+	 * previously interrupted such that we should resume rather than start
+	 * from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting for the other
+		 * backends to absorb the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier,
+		 * then switch the reported phase to enabling and start processing
+		 * the existing data.
+		 */
+		SetDataChecksumsOnInProgress();
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
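		/*
		 * All databases have been processed, so advance the data checksum
		 * state from inprogress-on to on.
		 */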
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This repeatedly generates a list of databases to process when enabling
+ * checksums, looping around to compute a new list and compare it against the
+ * databases already seen, until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Make the first database processed also take care of the shared
+	 * catalogs; they only need to be processed once, not in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not be changed during processing, the
+	 * columns for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed, and set the number
+			 * of relations to -1 to clear the field from previous values. -1
+			 * will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "process with restart" : "process completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases,
+																&db->dboid,
+																HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failures for databases which still
+		 * exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in database \"%s\"",
+							db->dbname)));
+			found_failed = true;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * These assignments are redundant with the MemSet above, but we keep
+	 * them to make the intended initial state explicit for readability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage (and thus need data
+ * checksums) are returned. If include_shared is true then shared relations
+ * are included as well in a non-temporary list; include_shared has no
+ * relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
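	/*
	 * Connect to the target database. BGWORKER_BYPASS_ALLOWCONN is passed so
	 * that databases with datallowconn set to false are processed as well,
	 * since data checksums must be set in every database in the cluster.
	 */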
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
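	 * Using a ring buffer access strategy keeps these bulk reads from
	 * evicting large parts of shared buffers.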
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in
+		 * pg_stat_activity.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index e73576ad12..86f2ba1e2a 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7783ba854f..d68df6d323 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..e51cfb3a98 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -573,6 +574,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
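+
+As an example, data checksums could be enabled in a running cluster, and
+later disabled again, with something like the following (the argument values
+shown here simply mirror the defaults used by the regression tests for this
+feature):
+
+    SELECT pg_enable_data_checksums(0, 100, false);
+    SELECT pg_disable_data_checksums();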
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa..2b5671e2ca 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -359,6 +359,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 16144c2b72..a0f3570759 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -114,6 +114,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for concurrent transactions to finish before enabling data checksums."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for temporary relations to be removed before finishing data checksum enabling."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -345,6 +347,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f7b50e0b5a..9f52541eda 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -246,6 +246,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1115,9 +1117,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1132,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index ef60f41b8c..8d15c15edb 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d..513b163ed7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -720,6 +720,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -864,7 +869,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..3ab7ead97d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -474,6 +474,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -599,7 +607,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1919,17 +1927,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5207,6 +5204,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f5f7ff1045..3bda3adb04 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a2..7118f9069c 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -13,6 +13,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -693,6 +694,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 34ad46c067..dd7c75abe3 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index d9cf51a0f9..80a354cb74 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index e80ff8e414..17da375aa6 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f23321a41f..2451fee403 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12155,6 +12155,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 5616d64523..8e3c180392 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -155,4 +155,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index e26d108a47..55e507683a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -357,6 +357,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -380,6 +383,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..59c9000d64
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6222d46e53..9ff4127e58 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -204,7 +204,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 6a2f64c54f..64e2ea5a50 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 5a3dd5d2d4..64137092a9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -449,9 +449,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The data checksums launcher and worker
+ * can consume 2 more slots while checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 221073def3..dbe68d7eed 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 7b63d38f97..063d3ef86c 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e623..278ce3e8a8 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling and disabling data
+checksums in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: "make check" creates a temporary installation with multiple
+nodes (primary and standbys as needed) for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..4c64f6a14f
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..2697b72225
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..6c0fe8f3bf
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary 2');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..9cee62c9b5
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index 67376e4b7f..2ce656a82a 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 007571e948..72c6d066bd 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3566,6 +3566,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
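+
+# A minimal usage sketch for the two helpers above (illustrative only,
+# mirroring the pattern in the checksum TAP tests); the node must be
+# stopped before calling either routine:
+#
+#   $node->stop;
+#   $node->checksum_enable_offline;
+#   $node->start;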
+
 =pod
 
 =back
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1847bbfa95..443977c3fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -396,6 +396,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -582,6 +583,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4079,6 +4084,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#16Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#15)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 5 Nov 2024, at 13:51, Daniel Gustafsson <daniel@yesql.se> wrote:

On 4 Nov 2024, at 12:27, Daniel Gustafsson <daniel@yesql.se> wrote:

Attached is a rebased v6 fixing the tests to handle that checksums are now on
by default, no other changes are made as no outstanding review comments or
identified bugs exist.

Does anyone object to going ahead with this?

And a new rebase to cope with recent changes,

..and one more since I forgot to git add the new expected output testfile.

--
Daniel Gustafsson

Attachments:

v8-0001-Online-enabling-and-disabling-of-data-checksums.patch (application/octet-stream)
From a5e90dde77ae3d64f8073fdf7cc8f4dd868ce13c Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 11 Oct 2024 09:26:13 +0200
Subject: [PATCH v8] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could previously only be enabled during initdb or while
the cluster is offline using the pg_checksums application. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  206 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   28 +
 src/tools/pgindent/typedefs.list              |    6 +
 54 files changed, 3030 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 73979f20ff..bd621e924c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29606,6 +29606,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates the process of enabling data checksums for the cluster.
+        This will switch the data checksum mode to <literal>inprogress-on</literal>
+        and start a background worker that will process all pages in the cluster and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index f54f25c1c6..57a9188dc1 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3..aeb96aeceb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3496,8 +3496,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       database. Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3507,8 +3507,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are not
-       enabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6670,6 +6670,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
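+
+  <!--
+   Illustrative sketch only: a query using the columns documented above to
+   watch the progress of an ongoing data checksum operation.
+  -->
+  <para>
+   For example:
+<programlisting>
+SELECT pid, phase, databases_processed, databases_total,
+       relations_processed, relations_total
+  FROM pg_stat_progress_data_checksums;
+</programlisting>
+  </para>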
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329..0343710af5 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any prior online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index a34cddb5ed..448a9c0c94 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -248,9 +248,10 @@
   <para>
    Checksums are normally enabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster, allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -267,7 +268,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -276,6 +277,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will switch the cluster's data checksum mode to
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition, a background worker process is
+    started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure
+    that <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will then
+    start over and process all databases from the beginning.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
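+   <!--
+    Illustrative sketch only: this example assumes the function defaults
+    documented in the Data Checksum Functions table and the GUC states
+    described above.
+   -->
+   <para>
+    For example, enabling checksums and observing the state change:
+<programlisting>
+SELECT pg_enable_data_checksums();
+SHOW data_checksums;   -- reports 'inprogress-on' while the worker is running
+-- once the worker has processed all databases:
+SHOW data_checksums;   -- reports 'on'
+</programlisting>
+   </para>
+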
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 363294d623..465153e5d1 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6f58412bca..3d676f0f62 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +725,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +840,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +856,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4546,9 +4561,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4582,13 +4595,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
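+
+/*
+ * Illustrative pattern only (not a call site added by this patch): a reader
+ * would typically verify right after computing the page checksum, e.g.
+ *
+ *     if (DataChecksumsNeedVerify() &&
+ *         ((PageHeader) page)->pd_checksum != pg_checksum_page((char *) page, blkno))
+ *         ... report a checksum failure ...
+ */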
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages, it can thus be used to check for
+ * aborted checksum processing which need to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint while disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be
+ * general-purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* GUC show hook for the data_checksums parameter */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6148,6 +6493,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums in the process of being enabled
+	 * (the "inprogress-on" state), notify the user that processing must be
+	 * manually restarted to finish enabling checksums. This is because we
+	 * cannot launch a dynamic background worker directly from here; it has to
+	 * be launched from a regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that all backends had stopped validating checksums, so we can
+	 * move straight to "off" without prompting the user for any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8155,6 +8527,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8583,6 +8973,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Emit a procsignalbarrier and wait until all processes have observed
+		 * the new checksum state before continuing with replay.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index b0c6d7c687..bf6f5a8347 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-like
+ * cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index e2ed9081d1..122ec9c3b1 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index c51dfca802..2301a338fb 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -668,6 +668,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE
 AS 'pg_set_attribute_stats';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
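+
+-- Illustrative usage only; the cost settings below are arbitrary examples:
+--   SELECT pg_enable_data_checksums(cost_delay => 2, cost_limit => 200);
+--   SELECT pg_disable_data_checksums();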
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -791,6 +798,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3456b821bc..841d416fa5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1323,6 +1323,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
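+
+-- Progress of an ongoing data checksums operation can be monitored with a
+-- query along these lines (the column selection is only an example):
+--   SELECT datname, phase, relation_current_blocks_processed,
+--          relation_current_blocks
+--     FROM pg_stat_progress_data_checksums;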
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index db08543d19..e112d4b53e 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 07bc5517fc..662cd12681 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 0000000000..de7a077f9c
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or on a shut-down cluster
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. When disabling checksums, the
+ * state transition is performed only in the control file; no changes are made
+ * to the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
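+ *
+ * As a rough sketch of the state transitions described in detail below:
+ *
+ *     enabling:   "off" -> "inprogress-on"  -> "on"
+ *     disabling:  "on"  -> "inprogress-off" -> "off"
+ *
+ * (an enable which is aborted or fails drops straight from "inprogress-on"
+ * back to "off")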
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This
+ * ensures that backends which have yet to move from the "on" state can still
+ * validate data checksums without errors.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "on"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring that no
+ * incompatible objects or processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL-logging backend updating the global state in the control file will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
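+ *
+ * As an illustrative sketch only (not the exact buffer manager code), an I/O
+ * path under this scheme brackets the state check and the write like so:
+ *
+ *     HOLD_INTERRUPTS();
+ *     if (DataChecksumsNeedWrite())
+ *         PageSetChecksumInplace(page, blknum);
+ *     ... perform the actual write of the page ...
+ *     RESUME_INTERRUPTS();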
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since a dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: If the checksum
+ *     on the page happens to already match we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip already-checksummed pages when it is used
+ *     to enable checksums on a cluster which is in "inprogress-on" state and
+ *     may have checksummed pages (i.e. make pg_checksums able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Request that data checksums be enabled or disabled cluster-wide
+ *
+ * Called from a regular backend, this records the requested target state in
+ * shared memory and registers the launcher as a dynamic background worker,
+ * unless one is already running.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can only happen if there was a valid checksum on the
+		 * page at some point in the past, i.e. when checksums were first on,
+		 * then off, and are now being turned on again. If wal_level is set to
+		 * "minimal", this could be avoided when the existing checksum is
+		 * already correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative but to exit. When enabling checksums we won't
+	 * at this point have changed the pg_control state to "on", so when the
+	 * cluster comes back up processing will have to be restarted. When
+	 * disabling, the pg_control state will have been set to "off" before this
+	 * point, so when the cluster comes back up checksums will be off as
+	 * expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory such that a new
+ * launcher can be started later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * after each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	 * There is no sleeping in the main loop; the flag is checked periodically
+	 * in ProcessSingleRelationFork. The launcher does however sleep while
+	 * waiting for concurrent transactions to end, so we still need to set the
+	 * latch.
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active at the call of the function
+ * have ended. If the postmaster dies while waiting, we exit with FATAL since
+ * processing cannot be completed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide, so exit; processing will have to be restarted once
+		 * the cluster is back up.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for the launcher process which drives enabling and disabling
+ * of data checksums in the cluster. It handles the overall state transitions
+ * and target-state changes, with ProcessAllDatabases being responsible for
+ * looping over the databases and initiating per-database processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, check whether they are already
+	 * fully enabled; if not, transition to "inprogress-on", process all
+	 * databases and finally switch the cluster state to "on".
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This repeatedly generates a list of databases to process when enabling
+ * checksums, looping around computing a new list and comparing it to the
+ * databases already seen, until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set things up so that the first database processed also handles the
+	 * shared catalogs; they don't need to be reprocessed for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not change during processing; the column
+	 * for processed databases is instead increased such that it can be
+	 * compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed, and set the number
+			 * of relations to -1 to clear the field from previous values. -1
+			 * will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply aren't enough worker slots in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting loop to look for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failures where the database still
+		 * exists.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in database \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is intended for tests, which otherwise cannot reliably be
+	 * kept within their timeouts.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * These assignments are redundant after the MemSet above, but we keep
+	 * them to be explicit about the initial state for readability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list. include_shared has no effect when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0ea4bbe084..3aacb2e0e7 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index e73576ad12..86f2ba1e2a 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 7783ba854f..d68df6d323 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 87027f27eb..e51cfb3a98 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -573,6 +574,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59a..73c36a6390 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..112c05ee7e 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -100,7 +100,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page((char *) page, blkno);
 
@@ -1512,7 +1512,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return (char *) page;
 
 	/*
@@ -1542,7 +1542,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page((char *) page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index cc2ffc78aa..2b5671e2ca 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -359,6 +359,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_WAL_SUMMARIZER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 16144c2b72..a0f3570759 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -114,6 +114,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -345,6 +347,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index f7b50e0b5a..9f52541eda 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -246,6 +246,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1115,9 +1117,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1133,9 +1132,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index ef60f41b8c..8d15c15edb 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -290,6 +290,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = "checkpointer";
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = "logger";
 			break;
@@ -841,7 +847,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d..513b163ed7 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -720,6 +720,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -864,7 +869,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..3ab7ead97d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -474,6 +474,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -599,7 +607,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1919,17 +1927,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5207,6 +5204,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f5f7ff1045..3bda3adb04 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -577,7 +577,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a2..7118f9069c 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -13,6 +13,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -693,6 +694,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 34ad46c067..dd7c75abe3 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index d9cf51a0f9..80a354cb74 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index e80ff8e414..17da375aa6 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index f23321a41f..2451fee403 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12155,6 +12155,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 5616d64523..8e3c180392 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -155,4 +155,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index e26d108a47..55e507683a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -357,6 +357,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -380,6 +383,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 extern const char *GetBackendTypeDesc(BackendType backendType);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 0000000000..59c9000d64
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6222d46e53..9ff4127e58 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -204,7 +204,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 08e9d598ce..e3e6ec3d4a 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 6a2f64c54f..64e2ea5a50 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 5a3dd5d2d4..64137092a9 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -449,9 +449,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums worker and launcher
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 221073def3..dbe68d7eed 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index 7b63d38f97..063d3ef86c 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e623..278ce3e8a8 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 0000000000..fd03bf73df
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 0000000000..0f0317060b
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation
+with multiple nodes (primary and standby(s)) for the purpose of
+the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 0000000000..5f96b5c246
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 0000000000..4c64f6a14f
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so
+# let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 0000000000..2697b72225
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarts during processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so
+# let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps its
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksum enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 0000000000..6c0fe8f3bf
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so
+# let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary 2');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 0000000000..9cee62c9b5
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast. For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so
+# let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps its
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksum enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index 67376e4b7f..2ce656a82a 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index e5526c7565..0ee435d628 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3574,6 +3574,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2b47013f11..47ea97ab9a 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2032,6 +2032,34 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+        CASE s.param3
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param3
+        END AS relations_total,
+    s.param4 AS databases_processed,
+    s.param5 AS relations_processed,
+    s.param6 AS databases_current,
+    s.param7 AS relation_current,
+    s.param8 AS relation_current_blocks,
+    s.param9 AS relation_current_blocks_processed
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1847bbfa95..443977c3fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -396,6 +396,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -582,6 +583,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4079,6 +4084,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#17Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#16)
2 attachment(s)
Re: Changing the state of data checksums in a running cluster

Hi,

Unfortunately it seems we're not out of the woods yet :-(

I started doing some more testing on the v8 patch. My plan was to do
some stress testing with physical replication, random restarts and stuff
like that. But I ran into issues before that.

Attached is a reproducer script that does this:

1) initializes an instance with a small (scale 10) pgbench database

2) runs a pgbench in the background, and flips checksums

3) restarts the database with fast or immediate mode

4) watches for checksums state until it reaches expected value

5) restarts the instance

Of course, the restart interrupts the checksum enable, with this message
in the log:

WARNING: data checksums are being enabled, but no worker is running
1731024482.102 2024-11-08 01:08:02.102 CET [267066] [startup:]
[672d5660.4133a:7] [2024-11-08 01:08:00 CET] [/0] HINT: If checksums
were being enabled during shutdown then processing must be manually
restarted.

That's expected, of course. So I did

SELECT pg_enable_data_checksums()

and "datachecksumsworker launcher" appeared in pg_stat_activity, but
nothing else was happening. It also says:

Waiting for worker in database template0 (pid 258442)

But there are no workers with that PID. Not in the OS, not in the view,
not in the server log. Seems a bit weird. Maybe it already completed,
but then why is there a launcher waiting for it?

Ultimately I tried running CHECKPOINT, and that apparently did the
trick, and the instance restarted. But then on start it hits an assert that:

(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)

But this only happens if the final stop is -m immediate. If I change it
to "-m fast" it works.

I haven't looked into the details, but I guess it's related to the issue
with controlfile update we dealt with about a month ago.

Attached is the test.sh file (make sure to tweak the paths), and an
example of the backtraces. I've seen various processes hitting that.

Two more comments:

* It's a bit surprising that pg_disable_data_checksums() flips the state
right away, while pg_enable_data_checksums() waits for a checkpoint. I
guess it's correct, but maybe the docs should mention this difference?

* The docs currently say:

<para>
If the cluster is stopped while in <literal>inprogress-on</literal> mode,
for any reason, then this process must be restarted manually. To do this,
re-execute the function <function>pg_enable_data_checksums()</function>
once the cluster has been restarted. The background worker will attempt
to resume the work from where it was interrupted.
</para>

I believe that's incorrect/misleading. There's no attempt to resume work
from where it was interrupted.

regards

--
Tomas Vondra

Attachments:

test.sh (application/x-shellscript)
backtraces.txt (text/plain)
#18Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#17)
Re: Changing the state of data checksums in a running cluster

Hi,

I spent a bit more time doing some testing on the last version of the
patch from [1]. And I ran into this assert in PostmasterStateMachine()
when stopping the cluster:

/* All types should be included in targetMask or remainMask */
Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);

At first I was puzzled as this happens on every shutdown, but that's
because these checks were introduced by a78af0427015 a week ago. So it's
more a matter of rebasing.

However, I also noticed the progress monitoring does not really work. I
get stuff like this:

    + psql -x test -c 'select * from pg_stat_progress_data_checksums'
    -[ RECORD 1 ]---------------------+---------
    pid                               | 56811
    datid                             | 0
    datname                           |
    phase                             | enabling
    databases_total                   | 4
    relations_total                   |
    databases_processed               | 0
    relations_processed               | 0
    databases_current                 | 16384
    relation_current                  | 0
    relation_current_blocks           | 0
    relation_current_blocks_processed | 0

But I've never seen any of the "processed" fields be non-zero (and
relations_total is even NULL), and the same applies to the relation_*
fields. Also, what is the datid/datname about? It's empty, not mentioned
in the sgml docs, and we already have databases_current ...

The message [2] from 10/08 says:

I did remove parts of the progress reporting for now since it can't be
used from the dynamic backgroundworker it seems. I need to regroup
and figure out a better way there, but I wanted to address your above
find sooner rather than wait for that.

And I guess that would explain why some of the fields are not updated.
But then the later patch versions seem to imply there are no outstanding
issues / missing stuff.

regards

[1]: /messages/by-id/CA226DE1-DC9A-4675-A83C-32270C473F0B@yesql.se

[2]: /messages/by-id/DD25705F-E75F-4DCA-B49A-5578F4F55D94@yesql.se

--
Tomas Vondra

#19Michael Paquier
michael@paquier.xyz
In reply to: Tomas Vondra (#18)
Re: Changing the state of data checksums in a running cluster

On Tue, Nov 26, 2024 at 11:07:12PM +0100, Tomas Vondra wrote:

I spent a bit more time doing some testing on the last version of the
patch from [1]. And I ran into this assert in PostmasterStateMachine()
when stopping the cluster:

/* All types should be included in targetMask or remainMask */
Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);

At first I was puzzled as this happens on every shutdown, but that's
because these checks were introduced by a78af0427015 a week ago. So it's
more a matter of rebasing.

Looks like the CI is not really happy about this point.. (Please make
sure to refresh the patch status after a review.)
--
Michael

#20Tomas Vondra
tomas@vondra.me
In reply to: Michael Paquier (#19)
5 attachment(s)
Re: Changing the state of data checksums in a running cluster

Here's a rebased version of the patch series, addressing the issues I've
pointed out in the last round of reviews. I've kept the changes in
separate patches for clarity, but they should be squashed into a single
patch in the end.

1) v20250309-0001-Online-enabling-and-disabling-of-data-chec.patch
------------------------------------------------------------------

Original patch, rebased, resolving merge conflicts.

2) v20250309-0002-simple-post-rebase-fixes.patch
------------------------------------------------

A couple of minor fixes, addressing test failures due to stuff committed
since the previous patch version. Mostly mechanical; the main change is
that I don't think the pgstat_bestart() call is necessary. Or is it?

3) v20250309-0003-sync-the-data_checksums-GUC-with-the-local.patch
------------------------------------------------------------------

This is the main change, fixing failures in 002_actions.pl - the short
version is that the test does "-C data_checksums", but AFAICS that does
not quite work because it does not call show_data_checksums() that early,
and instead just grabs the variable backing the GUC, which may be out of
sync. This patch fixes that by updating them both.

That fixes the issue, but isn't it a bit strange that we now have three places
tracking the state of data checksums? We have data_checksum_version in
the control file, and then data_checksums and LocalDataChecksumVersion
in the backends.

Would it be possible to "unify" the latter two? That would also mean we
don't have the duplicate constants like PG_DATA_CHECKSUM_VERSION and
DATA_CHECKSUM_VERSION. Or why do we need that?
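
For reference, here is a minimal sketch of what "updating them both" could
look like when a barrier is absorbed. The helper below is hypothetical and
not taken from the posted patch; only SetConfigOption() and the
PG_DATA_CHECKSUM_* constants are existing interfaces, and
LocalDataChecksumVersion is assumed to be the backend-local variable
mentioned above:

    /*
     * Hypothetical sketch: keep the backend-local checksum version and the
     * variable backing the data_checksums GUC in sync.
     */
    static void
    SetLocalDataChecksumState(uint32 version)
    {
        const char *guc_value;

        LocalDataChecksumVersion = version; /* backend-local cache */

        switch (version)
        {
            case PG_DATA_CHECKSUM_VERSION:
                guc_value = "on";
                break;
            case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
                guc_value = "inprogress-on";
                break;
            case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
                guc_value = "inprogress-off";
                break;
            default:
                guc_value = "off";
                break;
        }

        /* PGC_S_OVERRIDE allows changing a PGC_INTERNAL preset GUC */
        SetConfigOption("data_checksums", guc_value,
                        PGC_INTERNAL, PGC_S_OVERRIDE);
    }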

4) v20250309-0004-make-progress-reporting-work.patch
----------------------------------------------------

The progress reporting got "mostly disabled" in an earlier version, due
to issues with the bgworkers. AFAICS the issue is that the "status" row
can be updated only by a single process, which does not quite work with
the launcher + per-db workers architecture.

I've considered a couple different approaches:

a) updating the status only from the launcher

This is mostly what CREATE INDEX does with parallel builds, and there
it's mostly sufficient. But for checksums it'd mean we'd only have the
number of databases total/processed, and that seems unsatisfactory,
considering large clusters often have only a single large database. So
not good enough, IMHO.

b) allowing workers to update the status row, created by the launcher

I guess this would be better, we'd know the relations/blocks counts. And
I haven't tried coding this, but there would need to be some locking so
that the workers don't overwrite increments from other workers, etc.

But I don't think it'd work nicely with parallel per-db workers (which
we don't do now, but we might).

c) having one status entry per worker

I ended up doing this; it didn't require any changes to the progress
infrastructure, and it will work naturally even with multiple workers.
There will always be one row for the launcher status (which only has the
number of databases total/done), and then one row per worker, with
database-level info (datid, datname, #relations, #blocks) - see the
example query below.

I removed the "DONE" phase, because that's right before the launcher
exits, and I don't think we have that for similar cases. And I added a
"waiting on checkpoint" state, because that's often a long wait during
which the launcher seems to do nothing, so it seems useful to
communicate the reason for that wait.
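
For illustration, this is roughly what monitoring then looks like, as a
sketch against the view as defined in these patches (column names from
0001/0004, exact output shape not guaranteed):

SELECT pid, datname, phase,
       databases_processed, databases_total,
       relations_processed, relations_total,
       relation_current_blocks_processed, relation_current_blocks
  FROM pg_stat_progress_data_checksums
 ORDER BY pid;

The launcher row carries the database-level counters, and each worker
row carries the counters for the database it is currently processing.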

5) v20250309-0005-update-docs.patch
-----------------------------------

Minor tweaks to docs, to reflect the progress reporting changes, and
also some corrections (no resume after restart, ...).

So far this passed all my tests - both check-world and stress testing
(no failures / assert crashes, ...). One thing that puzzles me is that I
wasn't able to reproduce the failures reported in [1]/messages/by-id/e4dbcb2c-e04a-4ba2-bff0-8d979f55960e@vondra.me - not even with
just the rebase + minimal fixes (0001 + 0002). My best theory is that
this is somehow machine-specific, and my laptop is too slow or
something. I'll try with the machine I used before once it becomes
available.

regards

[1]: /messages/by-id/e4dbcb2c-e04a-4ba2-bff0-8d979f55960e@vondra.me

--
Tomas Vondra

Attachments:

v20250309-0001-Online-enabling-and-disabling-of-data-chec.patch (text/x-patch)
From c8e48010604a0042d22d1b0d3b31c335277b1ceb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250309 1/5] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  207 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   28 +
 src/tools/pgindent/typedefs.list              |    6 +
 54 files changed, 3031 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..a09843e4ecf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29865,6 +29865,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksum mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in the cluster
+        and enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch the data checksum mode
+        to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560e8..4a6ef5f1605 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). The last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6793,6 +6794,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for the removal of all temporary
+       tables which existed at the time the command was started.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any processing already
+   performed online.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..2562b1a002c 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, make sure that
+    <varname>max_worker_processes</varname> allows for at least two more
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application, it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..b5190a7e104 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +725,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +840,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +856,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4594,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4615,13 +4628,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControlData
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6194,6 +6539,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here; it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8573,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9019,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..0d62976dc1f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..de7a077f9c2
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums at initdb time, or offline with pg_checksums
+ * while the cluster is shut down, no extra process is required as each page
+ * is checksummed, and verified, when accessed.  When enabling checksums on an
+ * already running cluster, this worker will ensure that all pages are
+ * checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to validate data checksums correctly.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since a dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process to do this could make
+ *     restarting the processing automatic on cluster restart.
+ *   * Avoid dirtying the page when the checksum already matches: If the
+ *     checksum on the page already happens to match we still dirty the page.
+ *     It should be enough to only do the log_newpage_buffer() call in that
+ *     case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip already-checksummed pages when it is used
+ *     to enable checksums on a cluster which is in the inprogress-on state
+ *     and may have checksummed pages (i.e. make pg_checksums able to resume
+ *     an online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Launch the datachecksumsworker launcher process
+ *
+ * This is the entry point for starting data checksum processing, both for
+ * enabling and for disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. If wal_level is set to "minimal",
+		 * this could be avoided if the existing checksum is already correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't yet have changed the pg_control version to enabled, so when
+	 * the cluster comes back up processing will have to be restarted. When
+	 * disabling, the pg_control version will have been set to off before
+	 * this point, so when the cluster comes back up checksums will be off as
+	 * expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks waiting for all currently running transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, a FATAL error is
+ * raised since processing cannot be completed.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're waiting.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management,
+ * while ProcessAllDatabases is responsible for looping over the databases
+ * and initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, first check the current state to
+	 * see whether there is anything left to do.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier,
+		 * then switch the progress phase over to enabling.
+		 */
+		SetDataChecksumsOnInProgress();
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process when enabling
+ * checksums, looping around computing a new list and comparing it to the
+ * already seen ones until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database to be processed also handles the
+	 * shared catalogs; they don't need to be processed once per database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number is not changed during processing; the column for
+	 * processed databases is instead increased such that it can be compared
+	 * against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed and set the number
+			 * of relations to -1 to clear the field from previous values.
+			 * -1 will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "process with restart" : "process completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failures for databases which still
+		 * exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = true;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is intended for running tests, which would otherwise not
+	 * complete quickly enough to be reliably placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even though these fields are already zeroed by the MemSet above,
+		 * be explicit about the initial state for readability.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned; otherwise,
+ * non-temporary relations which have storage (and thus need data checksums)
+ * are returned. If include_shared is true then shared relations are included
+ * as well in a non-temporary list; include_shared has no relevance when
+ * building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 24d88f368d8..00cce4c7673 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
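
As a rough illustration of the barrier absorption wired up above: each absorb
function is expected to do little more than update a backend-local copy of the
data checksum state and report the barrier as processed. A hedged sketch, with
LocalDataChecksumVersion being an assumed backend-local variable (the real
implementations live in xlog.c and may differ):

    /* Assumed backend-local cache of the data checksum state (illustrative). */
    static uint32 LocalDataChecksumVersion = 0;

    bool
    AbsorbChecksumsOnBarrier(void)
    {
        /* Sketch only: switch the local state to "on" and report success. */
        LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
        return true;
    }

    bool
    AbsorbChecksumsOffBarrier(void)
    {
        /* Sketch only: switch the local state to "off" and report success. */
        LocalDataChecksumVersion = 0;
        return true;
    }
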
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
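
Per the locking rules documented at the top of datachecksumsworker.c, these
predicates are meant to be consulted with interrupts held across the I/O
operation. A minimal sketch of that calling pattern; the wrapper function is
hypothetical and only meant to show the HOLD_INTERRUPTS/RESUME_INTERRUPTS
bracketing:

    static void
    write_page_with_checksum(Page page, BlockNumber blkno)
    {
        /* Hypothetical helper, for illustration only. */
        HOLD_INTERRUPTS();
        if (!PageIsNew(page) && DataChecksumsNeedWrite())
            ((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
        /* ... write the page out while interrupts are still held ... */
        RESUME_INTERRUPTS();
    }
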
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index e199f071628..c7cd48e8d33 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 9172e1cb9d2..f6bd1e9f0ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ee1a9d5d98b..70899e6ef2d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -739,6 +739,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -869,7 +874,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ad25cbb39c5..05cba3a02e3 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1960,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5299,6 +5296,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
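
With data_checksums now an enum GUC, the show_data_checksums() hook declared
in xlog.h presumably maps the backend-local state onto the four labels above.
A hedged sketch (the backend-local variable is an assumption; the actual
implementation is in xlog.c and may differ):

    const char *
    show_data_checksums(void)
    {
        /* LocalDataChecksumVersion is an assumed backend-local cache. */
        switch (LocalDataChecksumVersion)
        {
            case PG_DATA_CHECKSUM_VERSION:
                return "on";
            case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
                return "inprogress-on";
            case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
                return "inprogress-off";
            default:
                return "off";
        }
    }
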
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..0bb32c9c0fc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index cede992b6e2..fb0e062b984 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12246,6 +12246,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
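
The C functions named in prosrc above are not part of this hunk; presumably
they reduce to thin wrappers around StartDataChecksumsWorkerLauncher(). A
sketch under that assumption (the actual wrappers may differ):

    Datum
    enable_data_checksums(PG_FUNCTION_ARGS)
    {
        int     cost_delay = PG_GETARG_INT32(0);
        int     cost_limit = PG_GETARG_INT32(1);
        bool    fast = PG_GETARG_BOOL(2);

        StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
        PG_RETURN_VOID();
    }

    Datum
    disable_data_checksums(PG_FUNCTION_ARGS)
    {
        /* Cost-based delay has no effect when disabling, so pass zeroes. */
        StartDataChecksumsWorkerLauncher(false, 0, 0, false);
        PG_RETURN_VOID();
    }
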
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..1ebd0c792b4 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index a2b63495eec..9923c7f518d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation
+with multiple nodes (primary and standbys) for the purpose of
+the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Change the underlying data before re-enabling checksums, to ensure that
+# the computed checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster which
+# is restarted while processing is underway
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We
+# accomplish this by setting up a background psql session which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then
+# start processing anyway and check that enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6c0fe8f3bf8
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We
+# accomplish this by setting up a background psql session which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then
+# start processing anyway and check that enabling remains in progress.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..36bed4b168d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,34 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+        CASE s.param3
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param3
+        END AS relations_total,
+    s.param4 AS databases_processed,
+    s.param5 AS relations_processed,
+    s.param6 AS databases_current,
+    s.param7 AS relation_current,
+    s.param8 AS relation_current_blocks,
+    s.param9 AS relation_current_blocks_processed
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9840060997f..86e4057f61d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -593,6 +594,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4141,6 +4146,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1

Attachment: v20250309-0002-simple-post-rebase-fixes.patch (text/x-patch)
From b72c6ed49314783a421239d350e1a05057b01da1 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 20:19:29 +0100
Subject: [PATCH v20250309 2/5] simple post-rebase fixes

- Update checks in PostmasterStateMachine to account for datachecksum
  workers, etc.

- Remove pgstat_bestart() call - it would need to be _initial(), but I
  don't think it's needed.

- Update vacuum_delay_point() call.
---
 src/backend/postmaster/datachecksumsworker.c | 5 +----
 src/backend/postmaster/postmaster.c          | 5 +++++
 src/backend/utils/activity/pgstat_backend.c  | 2 ++
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index de7a077f9c2..b9d003423c0 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -450,7 +450,7 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
-		vacuum_delay_point();
+		vacuum_delay_point(false);
 	}
 
 	pfree(relns);
@@ -752,9 +752,6 @@ DataChecksumsWorkerLauncherMain(Datum arg)
 	 */
 	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
 
-	/* Initialize backend status information */
-	pgstat_bestart();
-
 	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
 	DataChecksumsWorkerShmem->launcher_running = true;
 	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6f..2fc438987b5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2947,6 +2947,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a9343b7b59e..f4b2f3b91d8 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -292,6 +292,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
-- 
2.48.1

Attachment: v20250309-0003-sync-the-data_checksums-GUC-with-the-local.patch (text/x-patch)
From 9257d742294b27a4ee559fa31459fa0afa502b6b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 23:07:44 +0100
Subject: [PATCH v20250309 3/5] sync the data_checksums GUC with the local
 variable

We now have three places that in some way express the state of data
checksums in the instance.

- control file / data_checksum_version

- LocalDataChecksumVersion as a local cache to reduce locking

- data_checksums backing the GUC

We need to keep the GUC variable in sync, to ensure we get the correct
value even early during startup (e.g. with -C).

Introduces a new SetLocalDataChecksumVersion() which sets the local
variable and also updates the data_checksums GUC at the same time.
---
 src/backend/access/transam/xlog.c   | 51 +++++++++++++++++++++++++----
 src/backend/utils/misc/guc_tables.c |  3 +-
 src/include/access/xlog.h           |  1 +
 3 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b5190a7e104..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,6 +657,14 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -4594,7 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4913,7 +4921,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
 
@@ -4921,21 +4929,21 @@ bool
 AbsorbChecksumsOnBarrier(void)
 {
 	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffBarrier(void)
 {
-	LocalDataChecksumVersion = 0;
+	SetLocalDataChecksumVersion(0);
 	return true;
 }
 
@@ -4951,10 +4959,41 @@ void
 InitLocalControldata(void)
 {
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 	LWLockRelease(ControlFileLock);
 }
 
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
 /* guc hook */
 const char *
 show_data_checksums(void)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 05cba3a02e3..66abf056159 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -609,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -5300,7 +5299,7 @@ struct config_enum ConfigureNamesEnum[] =
 		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether data checksums are turned on for this cluster."),
 			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
 		},
 		&data_checksums,
 		DATA_CHECKSUMS_OFF, data_checksums_options,
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 0bb32c9c0fc..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
-- 
2.48.1

Attachment: v20250309-0004-make-progress-reporting-work.patch (text/x-patch)
From aa8a3867b3316a1fa379b1e70553057d5d96fedc Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sat, 8 Mar 2025 19:34:50 +0100
Subject: [PATCH v20250309 4/5] make progress reporting work

- Splits the progress status by worker type - one row for the launcher,
  one row per checksum worker (might be more with parallel workers); see
  the example query sketched after this list.

- The launcher only updates the database counters; the workers only set
  the relation/block counters.

- Also reworks the columns in the system view a bit, discards the
  "current" fields (we still know the database for each worker).

- Issue: Not sure what to do about relation forks; at this point it
  tracks only blocks for the MAIN fork.
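
To illustrate the split, a rough sketch of what a query against the
updated view might show while one database is being processed (column
names as defined in the reworked system_views.sql; the row shapes follow
the bullets above):

    SELECT pid, datname, phase, databases_total, databases_done,
           relations_total, relations_done, blocks_total, blocks_done
      FROM pg_stat_progress_data_checksums;

    -- launcher row: database counters set, relation/block counters NULL
    -- worker row:   relation/block counters set for its database
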
---
 src/backend/catalog/system_views.sql         | 36 ++++----
 src/backend/postmaster/datachecksumsworker.c | 91 +++++++++++++++++---
 src/include/commands/progress.h              | 27 +++---
 3 files changed, 112 insertions(+), 42 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0d62976dc1f..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1335,25 +1335,25 @@ CREATE VIEW pg_stat_progress_copy AS
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
 CREATE VIEW pg_stat_progress_data_checksums AS
-	SELECT
-		S.pid AS pid, S.datid AS datid, D.datname AS datname,
-		CASE S.param1 WHEN 0 THEN 'enabling'
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
-					  WHEN 2 THEN 'waiting'
-					  WHEN 3 THEN 'waiting on backends'
-					  WHEN 4 THEN 'waiting on temporary tables'
-					  WHEN 5 THEN 'done'
-					  END AS phase,
-		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
-		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
-		S.param4 AS databases_processed,
-		S.param5 AS relations_processed,
-		S.param6 AS databases_current,
-		S.param7 AS relation_current,
-		S.param8 AS relation_current_blocks,
-		S.param9 AS relation_current_blocks_processed
-	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
-		LEFT JOIN pg_database D ON S.datid = D.oid;
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
 
 CREATE VIEW pg_user_mappings AS
     SELECT
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index b9d003423c0..f79dc290b2b 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -409,6 +409,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
 	/*
 	 * We are looping over the blocks which existed at the time of process
 	 * start, which is safe since new blocks are created with checksums set
@@ -450,6 +455,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
 		vacuum_delay_point(false);
 	}
 
@@ -798,6 +808,8 @@ again:
 		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
 
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -908,8 +920,29 @@ ProcessAllDatabases(bool immediate_checkpoint)
 	 * columns for processed databases is instead increased such that it can
 	 * be compared against the total.
 	 */
-	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
-								 list_length(DatabaseList));
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
 
 	while (true)
 	{
@@ -921,14 +954,6 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			DataChecksumsWorkerResultEntry *entry;
 			bool		found;
 
-			/*
-			 * Indicate which database is being processed set the number of
-			 * relations to -1 to clear field from previous values. -1 will
-			 * translate to NULL in the progress view.
-			 */
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
-
 			/*
 			 * Check if this database has been processed already, and if so
 			 * whether it should be retried or skipped.
@@ -957,6 +982,12 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			result = ProcessDatabase(db);
 			processed_databases++;
 
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
 			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
 			{
 				/*
@@ -1050,6 +1081,13 @@ ProcessAllDatabases(bool immediate_checkpoint)
 				 errhint("The server log might have more information on the cause of the error.")));
 	}
 
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint to make
+	 * sure all checksummed pages have been flushed to disk.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
 	/*
 	 * Force a checkpoint to get everything out to disk. The use of immediate
 	 * checkpoints is for running tests, as they would otherwise not execute
@@ -1255,6 +1293,7 @@ DataChecksumsWorkerMain(Datum arg)
 	List	   *InitialTempTableList = NIL;
 	BufferAccessStrategy strategy;
 	bool		aborted = false;
+	int64		rels_done;
 
 	enabling_checksums = true;
 
@@ -1268,6 +1307,10 @@ DataChecksumsWorkerMain(Datum arg)
 	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
 											  BGWORKER_BYPASS_ALLOWCONN);
 
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
 	/*
 	 * Get a list of all temp tables present as we start in this database. We
 	 * need to wait until they are all gone until we are done, since we cannot
@@ -1294,6 +1337,24 @@ DataChecksumsWorkerMain(Datum arg)
 
 	RelationList = BuildRelationList(false,
 									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
 	foreach_oid(reloid, RelationList)
 	{
 		if (!ProcessSingleRelationByOid(reloid, strategy))
@@ -1301,6 +1362,9 @@ DataChecksumsWorkerMain(Datum arg)
 			aborted = true;
 			break;
 		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
 	}
 	list_free(RelationList);
 
@@ -1313,6 +1377,10 @@ DataChecksumsWorkerMain(Datum arg)
 		return;
 	}
 
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
 	/*
 	 * Wait for all temp tables that existed when we started to go away. This
 	 * is necessary since we cannot "reach" them to enable checksums. Any temp
@@ -1369,5 +1437,8 @@ DataChecksumsWorkerMain(Datum arg)
 
 	list_free(InitialTempTableList);
 
+	/* worker done */
+	pgstat_progress_end_command();
+
 	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1ebd0c792b4..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -158,21 +158,20 @@
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
 /* Progress parameters for PROGRESS_DATACHECKSUMS */
-#define PROGRESS_DATACHECKSUMS_PHASE 0
-#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
-#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
-#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
-#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
-#define PROGRESS_DATACHECKSUMS_CUR_DB 5
-#define PROGRESS_DATACHECKSUMS_CUR_REL 6
-#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
-#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
 
 /* Phases of datachecksumsworker operation */
-#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
-#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
-#define PROGRESS_DATACHECKSUMS_DONE 4
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
 
 #endif
-- 
2.48.1

Attachment: v20250309-0005-update-docs.patch (text/x-patch)
From 3015bbc4b6301d87ac7a186e19fa2eec0763b6c8 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sun, 9 Mar 2025 13:34:10 +0100
Subject: [PATCH v20250309 5/5] update docs

---
 doc/src/sgml/monitoring.sgml | 84 ++++++++++++++++++++----------------
 doc/src/sgml/wal.sgml        |  8 ++--
 2 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4a6ef5f1605..2a568ee2357 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6804,8 +6804,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
   <para>
    When data checksums are being enabled on a running cluster, the
    <structname>pg_stat_progress_data_checksums</structname> view will contain
-   a row for each background worker which is currently calculating checksums
-   for the data pages.
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
   </para>
 
   <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
@@ -6837,37 +6837,33 @@ FROM pg_stat_get_backend_idset() AS backendid;
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>phase</structfield> <type>text</type>
-       </para>
-       <para>
-        Current processing phase, see <xref linkend="datachecksum-phases"/>
-        for description of the phases.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher process
+      </para></entry>
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>databases_total</structfield> <type>integer</type>
-       </para>
-       <para>
-        The total number of databases which will be processed.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_total</structfield> <type>integer</type>
+        <structfield>phase</structfield> <type>text</type>
        </para>
        <para>
-        The total number of relations which will be processed, or
-        <literal>NULL</literal> if the data checksums worker process hasn't
-        calculated the number of relations yet.
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
        </para>
       </entry>
      </row>
@@ -6875,10 +6871,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_processed</structfield> <type>integer</type>
+        <structfield>databases_total</structfield> <type>integer</type>
        </para>
        <para>
-        The number of databases which have been processed.
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6886,10 +6884,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_processed</structfield> <type>integer</type>
+        <structfield>databases_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of relations which have been processed.
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6897,10 +6897,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_current</structfield> <type>oid</type>
+        <structfield>relations_total</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the database currently being processed.
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.  The launcher process always
+        has this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6908,10 +6911,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current</structfield> <type>oid</type>
+        <structfield>relations_done</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the relation currently being processed.
+        The number of relations which have been processed.  The launcher
+        process always has this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6919,10 +6923,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks</structfield> <type>integer</type>
+        <structfield>blocks_total</structfield> <type>integer</type>
        </para>
        <para>
-        The total number of blocks in the relation currently being processed.
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet.  The launcher process always
+        has this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6930,11 +6937,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+        <structfield>blocks_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of blocks which have been processed in the relation currently
-        being processed.
+        The number of blocks in the current relation which have been processed.
+        The launcher process always has this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6982,9 +6989,10 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </entry>
      </row>
      <row>
-      <entry><literal>done</literal></entry>
+      <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
-       The command has finished processing and is exiting.
+       The command is waiting for a checkpoint to complete before updating
+       the final checksum state.
       </entry>
      </row>
     </tbody>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 2562b1a002c..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -310,15 +310,17 @@
     If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
     any reason, then this process must be restarted manually. To do this,
     re-execute the function <function>pg_enable_data_checksums()</function>
-    once the cluster has been restarted. The background worker will attempt
-    to resume the work from where it was interrupted.
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
    </para>
 
    <note>
     <para>
      Enabling checksums can cause significant I/O to the system, as most of the
      database pages will need to be rewritten, and will be written both to the
-     data files and the WAL.
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
     </para>
    </note>
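As a sketch of the restart scenario above, the call is simply re-issued once
the cluster is back up; the parameter names follow the function definition in
this patch, and the cost semantics are assumed to mirror vacuum's cost-based
delay:

    -- processing restarts from scratch, throttled
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);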
 
-- 
2.48.1

#21Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#20)
5 attachment(s)
Re: Changing the state of data checksums in a running cluster

Seems cfbot was unhappy with the patches, so here's an improved version,
fixing some minor issues in expected output and a compiler warning.

There however seems to be some issue with 003_standby_restarts, which
causes failures on freebsd and macos. I don't know what that is about,
but the test runs much longer than on debian.

regards

--
Tomas Vondra

Attachments:

v20250310-0001-Online-enabling-and-disabling-of-data-chec.patch (text/x-patch, UTF-8)
From c8e48010604a0042d22d1b0d3b31c335277b1ceb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250310 1/5] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could previously only be enabled during initdb or when
the cluster is offline using the pg_checksums application. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends keep writing
checksums, without verifying them, until the barrier for setting the
state to off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.
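For illustration, the resulting cluster-wide state is visible through the
data_checksums GUC; a sketch of what one might see at different points while
enabling, using the values introduced by this patch:

    SHOW data_checksums;   -- "off" before processing starts
    SHOW data_checksums;   -- "inprogress-on" while the worker rewrites pages
    SHOW data_checksums;   -- "on" once every backend has absorbed the barrier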

This work is based on an earlier version of this patch which was
reviewed by, among others, Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  207 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   28 +
 src/tools/pgindent/typedefs.list              |    6 +
 54 files changed, 3031 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..a09843e4ecf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29865,6 +29865,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
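A brief usage sketch of the two functions documented in the table above (both
require superuser in this version of the patch):

    SELECT pg_enable_data_checksums();    -- uses the default cost settings
    SELECT pg_disable_data_checksums();   -- later, turn checksums off again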
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560e8..4a6ef5f1605 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). The last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
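For example, accumulated failures can be inspected per database with a query
such as the following (these columns already exist in pg_stat_database; only
their NULL behavior changes with this patch):

    SELECT datname, checksum_failures, checksum_last_failure
      FROM pg_stat_database
     WHERE checksum_failures > 0;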
 
@@ -6793,6 +6794,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase; see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
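A monitoring sketch against the view defined above, using the column names as
they appear in this version of the patch:

    SELECT pid, datname, phase,
           relations_processed, relations_total,
           relation_current_blocks_processed, relation_current_blocks
      FROM pg_stat_progress_data_checksums;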
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any progress made by the
+   interrupted online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..2562b1a002c 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster, allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will set the cluster checksum mode to
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, make sure that
+    <varname>max_worker_processes</varname> allows for at least two more
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
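As an aid for the temporary-table wait described above, pre-existing temporary
tables in the current database can be listed with a standard catalog query,
for example:

    SELECT n.nspname, c.relname
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relpersistence = 't';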
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..b5190a7e104 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +725,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +840,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +856,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4594,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4615,13 +4628,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. It must be called as close to the write operation as possible
+ * to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  It must be called as
+ * close to the validation as possible to keep the critical section short, in
+ * order to protect against time-of-check/time-of-use situations around data
+ * checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6194,6 +6539,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here, it has to be launched from a
+	 * background worker directly from here; it has to be launched from a
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8573,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9019,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
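A sketch of how the argument checks above surface at the SQL level (error
wording taken from the ereport calls in this function):

    SELECT pg_enable_data_checksums(cost_delay => -1);
    -- ERROR:  cost delay cannot be a negative value
    SELECT pg_enable_data_checksums(cost_limit => 0);
    -- ERROR:  cost limit must be greater than zero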
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..0d62976dc1f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..de7a077f9c2
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums at initdb time, or on a shut-down cluster with
+ * pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state can still validate
+ * data checksums safely.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: if the checksum
+ *     on the page happens to already match, we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip already checksummed pages when it is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back and process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
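+
+/*
+ * Illustration only: the launcher above is reached via the SQL-callable
+ * functions added in pg_proc.dat. A sketch of a typical invocation, with
+ * made-up values for (cost_delay, cost_limit, fast):
+ *
+ *     SELECT pg_enable_data_checksums(0, 0, false);
+ *     SHOW data_checksums;    -- reports "inprogress-on" while processing
+ *     SELECT pg_disable_data_checksums();
+ */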
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. If wal_level is "minimal", this
+		 * could be avoided when the checksum is verified to already be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without a
+		 * lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will already have been set to
+	 * off, so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory so that a new
+ * launcher can be started once this one has exited or aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The processing won't be interrupted immediately but the abort flag is
+ * checked between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep while waiting for concurrent transactions to end, so we still
+	 * need to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all currently running transactions have finished
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, the process exits
+ * with FATAL since data checksum processing cannot then be completed
+ * cluster-wide.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, verify the current cluster state
+	 * before entering the inprogress-on state and starting to process
+	 * databases.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier,
+		 * then report the enabling phase once all backends have absorbed it.
+		 */
+		SetDataChecksumsOnInProgress();
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
		DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also handles the shared
+	 * catalogs; they are not reprocessed in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number is not changed during processing; the column for
+	 * processed databases is instead increased so that it can be compared
+	 * against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed and set the number
+			 * of relations to -1 to clear the field from previous values. -1
+			 * will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "process with restart" : "process completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. Immediate checkpoints
+	 * are mainly intended for tests, which otherwise could not reliably be
+	 * placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, we
+	 * want to be explicit about the initial state for readability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned, otherwise non-temporary
+ * relations with storage (which thus need data checksums) are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list; include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in the
+		 * pg_stat_activity view.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 24d88f368d8..00cce4c7673 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index e199f071628..c7cd48e8d33 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksum enabling to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 9172e1cb9d2..f6bd1e9f0ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ee1a9d5d98b..70899e6ef2d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -739,6 +739,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -869,7 +874,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ad25cbb39c5..05cba3a02e3 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1960,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5299,6 +5296,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..0bb32c9c0fc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index cede992b6e2..fb0e062b984 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12246,6 +12246,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..1ebd0c792b4 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index a2b63495eec..9923c7f518d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used for when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION defines that data checksums are enabled in the
+ * cluster and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION defines that data
+ * checksums are either currently being enabled or disabled.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * consume two more slots while checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation
+with multiple nodes (a primary and one or more standbys) for the
+purpose of the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this with a background psql session which keeps the temporary
+# table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway, then check that checksums remain in the inprogress-on state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6c0fe8f3bf8
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary 2');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this with a background psql session which keeps the temporary
+# table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway, then check that checksums remain in the inprogress-on state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..36bed4b168d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,34 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+        CASE s.param3
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param3
+        END AS relations_total,
+    s.param4 AS databases_processed,
+    s.param5 AS relations_processed,
+    s.param6 AS databases_current,
+    s.param7 AS relation_current,
+    s.param8 AS relation_current_blocks,
+    s.param9 AS relation_current_blocks_processed
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9840060997f..86e4057f61d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -593,6 +594,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4141,6 +4146,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1
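
For reference, a minimal SQL-level sketch of the flow the TAP tests above
exercise; the parameter values mirror the tests and are illustrative rather
than recommended settings:

    -- enable checksums in a running cluster (cost_delay, cost_limit, fast)
    SELECT pg_enable_data_checksums(0, 100, true);

    -- poll until the cluster-wide state goes from 'inprogress-on' to 'on'
    SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';

    -- and disable again
    SELECT pg_disable_data_checksums();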

v20250310-0002-simple-post-rebase-fixes.patch (text/x-patch)
From 54fda18ca79d1ab2626c8c5a5177f462a43f4976 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 20:19:29 +0100
Subject: [PATCH v20250310 2/5] simple post-rebase fixes

- Update checks in PostmasterStateMachine to account for datachecksum
  workers, etc.

- Remove pgstat_bestart() call - it would need to be _initial(), but I
  don't think it's needed.

- Update vacuum_delay_point() call.

- Cast PID to long in elog call (same as we do in postmaster.c)
---
 src/backend/postmaster/datachecksumsworker.c | 7 ++-----
 src/backend/postmaster/postmaster.c          | 5 +++++
 src/backend/utils/activity/pgstat_backend.c  | 2 ++
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index de7a077f9c2..b30d5481b06 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -450,7 +450,7 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
-		vacuum_delay_point();
+		vacuum_delay_point(false);
 	}
 
 	pfree(relns);
@@ -587,7 +587,7 @@ ProcessDatabase(DataChecksumsWorkerDatabase *db)
 					db->dbname)));
 
 	snprintf(activity, sizeof(activity) - 1,
-			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
 	status = WaitForBackgroundWorkerShutdown(bgw_handle);
@@ -752,9 +752,6 @@ DataChecksumsWorkerLauncherMain(Datum arg)
 	 */
 	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
 
-	/* Initialize backend status information */
-	pgstat_bestart();
-
 	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
 	DataChecksumsWorkerShmem->launcher_running = true;
 	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6f..2fc438987b5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2947,6 +2947,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a9343b7b59e..f4b2f3b91d8 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -292,6 +292,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
-- 
2.48.1

v20250310-0003-sync-the-data_checksums-GUC-with-the-local.patch (text/x-patch)
From 8d4315cb29088fe46907ab638656df295f3ab43c Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 23:07:44 +0100
Subject: [PATCH v20250310 3/5] sync the data_checksums GUC with the local
 variable

We now have three places that in some way express the state of data
checksums in the instance.

- control file / data_checksum_version

- LocalDataChecksumVersion as a local cache to reduce locking

- data_checksums backing the GUC

We need to keep the GUC variable in sync, to ensure we get the correct
value even early during startup (e.g. with -C).

Introduces a new SetLocalDataChecksumVersion() which sets the local
variable, and also updates the data_checksums GUC at the same time.
---
 src/backend/access/transam/xlog.c   | 51 +++++++++++++++++++++++++----
 src/backend/utils/misc/guc_tables.c |  3 +-
 src/include/access/xlog.h           |  1 +
 3 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b5190a7e104..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,6 +657,14 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -4594,7 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4913,7 +4921,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
 
@@ -4921,21 +4929,21 @@ bool
 AbsorbChecksumsOnBarrier(void)
 {
 	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffBarrier(void)
 {
-	LocalDataChecksumVersion = 0;
+	SetLocalDataChecksumVersion(0);
 	return true;
 }
 
@@ -4951,10 +4959,41 @@ void
 InitLocalControldata(void)
 {
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 	LWLockRelease(ControlFileLock);
 }
 
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
 /* guc hook */
 const char *
 show_data_checksums(void)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 05cba3a02e3..66abf056159 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -609,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -5300,7 +5299,7 @@ struct config_enum ConfigureNamesEnum[] =
 		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether data checksums are turned on for this cluster."),
 			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
 		},
 		&data_checksums,
 		DATA_CHECKSUMS_OFF, data_checksums_options,
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 0bb32c9c0fc..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
-- 
2.48.1
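
With the GUC kept in sync as in this patch, the intermediate states should be
visible directly from SQL; a rough sketch (the state strings are the ones the
TAP tests poll for):

    SHOW data_checksums;     -- 'off' before enabling
    SELECT pg_enable_data_checksums(0, 100, true);
    SHOW data_checksums;     -- 'inprogress-on' while the workers run
    -- after the launcher finishes and the barrier has been absorbed:
    SHOW data_checksums;     -- 'on'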

v20250310-0004-make-progress-reporting-work.patch (text/x-patch)
From 1f043a7de9e36005c75b5386d17ea2d75dda9e43 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sat, 8 Mar 2025 19:34:50 +0100
Subject: [PATCH v20250310 4/5] make progress reporting work

- Splits the progress status by worker type - one row for launcher,
  one row for checksum worker (might be more with parallel workers).

- The launcher only updates database counters, the workers only set
  relation/block counters.

- Also reworks the columns in the system view a bit, discards the
  "current" fields (we still know the database for each worker).

- Issue: Not sure what to do about relation forks, at this point it
  tracks only blocks for MAIN fork.
---
 src/backend/catalog/system_views.sql         | 36 ++++----
 src/backend/postmaster/datachecksumsworker.c | 91 +++++++++++++++++---
 src/include/commands/progress.h              | 27 +++---
 src/test/regress/expected/rules.out          | 29 ++++---
 4 files changed, 131 insertions(+), 52 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0d62976dc1f..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1335,25 +1335,25 @@ CREATE VIEW pg_stat_progress_copy AS
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
 CREATE VIEW pg_stat_progress_data_checksums AS
-	SELECT
-		S.pid AS pid, S.datid AS datid, D.datname AS datname,
-		CASE S.param1 WHEN 0 THEN 'enabling'
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
-					  WHEN 2 THEN 'waiting'
-					  WHEN 3 THEN 'waiting on backends'
-					  WHEN 4 THEN 'waiting on temporary tables'
-					  WHEN 5 THEN 'done'
-					  END AS phase,
-		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
-		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
-		S.param4 AS databases_processed,
-		S.param5 AS relations_processed,
-		S.param6 AS databases_current,
-		S.param7 AS relation_current,
-		S.param8 AS relation_current_blocks,
-		S.param9 AS relation_current_blocks_processed
-	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
-		LEFT JOIN pg_database D ON S.datid = D.oid;
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
 
 CREATE VIEW pg_user_mappings AS
     SELECT
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index b30d5481b06..4101fdff107 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -409,6 +409,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
 	/*
 	 * We are looping over the blocks which existed at the time of process
 	 * start, which is safe since new blocks are created with checksums set
@@ -450,6 +455,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
 		vacuum_delay_point(false);
 	}
 
@@ -798,6 +808,8 @@ again:
 		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
 
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -908,8 +920,29 @@ ProcessAllDatabases(bool immediate_checkpoint)
 	 * columns for processed databases is instead increased such that it can
 	 * be compared against the total.
 	 */
-	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
-								 list_length(DatabaseList));
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
 
 	while (true)
 	{
@@ -921,14 +954,6 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			DataChecksumsWorkerResultEntry *entry;
 			bool		found;
 
-			/*
-			 * Indicate which database is being processed set the number of
-			 * relations to -1 to clear field from previous values. -1 will
-			 * translate to NULL in the progress view.
-			 */
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
-
 			/*
 			 * Check if this database has been processed already, and if so
 			 * whether it should be retried or skipped.
@@ -957,6 +982,12 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			result = ProcessDatabase(db);
 			processed_databases++;
 
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
 			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
 			{
 				/*
@@ -1050,6 +1081,13 @@ ProcessAllDatabases(bool immediate_checkpoint)
 				 errhint("The server log might have more information on the cause of the error.")));
 	}
 
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint before the
+	 * checksum state change can be made durable.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
 	/*
 	 * Force a checkpoint to get everything out to disk. The use of immediate
 	 * checkpoints is for running tests, as they would otherwise not execute
@@ -1255,6 +1293,7 @@ DataChecksumsWorkerMain(Datum arg)
 	List	   *InitialTempTableList = NIL;
 	BufferAccessStrategy strategy;
 	bool		aborted = false;
+	int64		rels_done;
 
 	enabling_checksums = true;
 
@@ -1268,6 +1307,10 @@ DataChecksumsWorkerMain(Datum arg)
 	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
 											  BGWORKER_BYPASS_ALLOWCONN);
 
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
 	/*
 	 * Get a list of all temp tables present as we start in this database. We
 	 * need to wait until they are all gone until we are done, since we cannot
@@ -1294,6 +1337,24 @@ DataChecksumsWorkerMain(Datum arg)
 
 	RelationList = BuildRelationList(false,
 									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
 	foreach_oid(reloid, RelationList)
 	{
 		if (!ProcessSingleRelationByOid(reloid, strategy))
@@ -1301,6 +1362,9 @@ DataChecksumsWorkerMain(Datum arg)
 			aborted = true;
 			break;
 		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
 	}
 	list_free(RelationList);
 
@@ -1313,6 +1377,10 @@ DataChecksumsWorkerMain(Datum arg)
 		return;
 	}
 
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
 	/*
 	 * Wait for all temp tables that existed when we started to go away. This
 	 * is necessary since we cannot "reach" them to enable checksums. Any temp
@@ -1369,5 +1437,8 @@ DataChecksumsWorkerMain(Datum arg)
 
 	list_free(InitialTempTableList);
 
+	/* worker done */
+	pgstat_progress_end_command();
+
 	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1ebd0c792b4..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -158,21 +158,20 @@
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
 /* Progress parameters for PROGRESS_DATACHECKSUMS */
-#define PROGRESS_DATACHECKSUMS_PHASE 0
-#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
-#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
-#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
-#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
-#define PROGRESS_DATACHECKSUMS_CUR_DB 5
-#define PROGRESS_DATACHECKSUMS_CUR_REL 6
-#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
-#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
 
 /* Phases of datachecksumsworker operation */
-#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
-#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
-#define PROGRESS_DATACHECKSUMS_DONE 4
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 36bed4b168d..2cfea837554 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2050,25 +2050,34 @@ pg_stat_progress_data_checksums| SELECT s.pid,
             WHEN 2 THEN 'waiting'::text
             WHEN 3 THEN 'waiting on backends'::text
             WHEN 4 THEN 'waiting on temporary tables'::text
-            WHEN 5 THEN 'done'::text
+            WHEN 5 THEN 'waiting on checkpoint'::text
+            WHEN 6 THEN 'done'::text
             ELSE NULL::text
         END AS phase,
         CASE s.param2
             WHEN '-1'::integer THEN NULL::bigint
             ELSE s.param2
         END AS databases_total,
-        CASE s.param3
+    s.param3 AS databases_done,
+        CASE s.param4
             WHEN '-1'::integer THEN NULL::bigint
-            ELSE s.param3
+            ELSE s.param4
         END AS relations_total,
-    s.param4 AS databases_processed,
-    s.param5 AS relations_processed,
-    s.param6 AS databases_current,
-    s.param7 AS relation_current,
-    s.param8 AS relation_current_blocks,
-    s.param9 AS relation_current_blocks_processed
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
    FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
-     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
-- 
2.48.1
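
A sketch of how the reworked view could be queried while checksums are being
enabled, using the column names from the rules.out changes above; the launcher
row carries the database counters and each per-database worker row carries the
relation and block counters:

    SELECT pid, datname, phase,
           databases_total, databases_done,
           relations_total, relations_done,
           blocks_total, blocks_done
      FROM pg_stat_progress_data_checksums;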

v20250310-0005-update-docs.patch (text/x-patch)
From 318c3f960707f80a53437635140ab935e936262e Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sun, 9 Mar 2025 13:34:10 +0100
Subject: [PATCH v20250310 5/5] update docs

---
 doc/src/sgml/monitoring.sgml | 84 ++++++++++++++++++++----------------
 doc/src/sgml/wal.sgml        |  8 ++--
 2 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4a6ef5f1605..2a568ee2357 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6804,8 +6804,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
   <para>
    When data checksums are being enabled on a running cluster, the
    <structname>pg_stat_progress_data_checksums</structname> view will contain
-   a row for each background worker which is currently calculating checksums
-   for the data pages.
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
   </para>
 
   <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
@@ -6837,37 +6837,33 @@ FROM pg_stat_get_backend_idset() AS backendid;
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>phase</structfield> <type>text</type>
-       </para>
-       <para>
-        Current processing phase, see <xref linkend="datachecksum-phases"/>
-        for description of the phases.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the
+       launcher process.
+      </para></entry>
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>databases_total</structfield> <type>integer</type>
-       </para>
-       <para>
-        The total number of databases which will be processed.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_total</structfield> <type>integer</type>
+        <structfield>phase</structfield> <type>text</type>
        </para>
        <para>
-        The total number of relations which will be processed, or
-        <literal>NULL</literal> if the data checksums worker process hasn't
-        calculated the number of relations yet.
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
        </para>
       </entry>
      </row>
@@ -6875,10 +6871,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_processed</structfield> <type>integer</type>
+        <structfield>databases_total</structfield> <type>integer</type>
        </para>
        <para>
-        The number of databases which have been processed.
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6886,10 +6884,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_processed</structfield> <type>integer</type>
+        <structfield>databases_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of relations which have been processed.
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6897,10 +6897,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_current</structfield> <type>oid</type>
+        <structfield>relations_total</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the database currently being processed.
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. For the launcher process
+        this is always <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6908,10 +6911,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current</structfield> <type>oid</type>
+        <structfield>relations_done</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the relation currently being processed.
+        The number of relations which have been processed. For the launcher
+        process this is always <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6919,10 +6923,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks</structfield> <type>integer</type>
+        <structfield>blocks_total</structfield> <type>integer</type>
        </para>
        <para>
-        The total number of blocks in the relation currently being processed.
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. For the launcher process
+        this is always <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6930,11 +6937,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+        <structfield>blocks_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of blocks which have been processed in the relation currently
-        being processed.
+        The number of blocks in the current relation which have been processed.
+        For the launcher process this is always <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6982,9 +6989,10 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </entry>
      </row>
      <row>
-      <entry><literal>done</literal></entry>
+      <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
-       The command has finished processing and is exiting.
+       The command is currently waiting for a checkpoint before the checksum
+       state can be updated at the end of processing.
       </entry>
      </row>
     </tbody>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 2562b1a002c..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -310,15 +310,17 @@
     If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
     any reason, then this process must be restarted manually. To do this,
     re-execute the function <function>pg_enable_data_checksums()</function>
-    once the cluster has been restarted. The background worker will attempt
-    to resume the work from where it was interrupted.
+    once the cluster has been restarted. The process will start over from the
+    beginning; resuming work from where it was interrupted is not supported.
    </para>
 
    <note>
     <para>
      Enabling checksums can cause significant I/O to the system, as most of the
      database pages will need to be rewritten, and will be written both to the
-     data files and the WAL.
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
     </para>
    </note>
 
-- 
2.48.1

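As a worked example of the restart behaviour described in the wal.sgml hunk
above, this is roughly what the operator would run once the cluster is back
up; the call uses the cost_delay/cost_limit parameters added by the patch and
the values shown are purely illustrative:

    -- Processing does not resume automatically after a restart in
    -- "inprogress-on" mode; re-launch it, optionally throttled.
    SHOW data_checksums;          -- reports "inprogress-on" after such a restart
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);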
#22Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#21)
5 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 3/10/25 00:35, Tomas Vondra wrote:

Seems cfbot was unhappy with the patches, so here's an improved version,
fixing some minor issues in expected output and a compiler warning.

There however seems to be some issue with 003_standby_restarts, which
causes failures on freebsd and macos. I don't know what that is about,
but the test runs much longer than on debian.

OK, turns out the failures were caused by the test creating a standby
from a backup, without a slot, so sometimes the primary removed the
necessary WAL. Fixed in the attached version.

There's still a failure on windows, though. I'd bet that's due to the
data_checksum/LocalDataChecksumVersion sync not working correctly on
builds with EXEC_BACKEND, or something like that, but it's too late so
I'll take a closer look tomorrow.

regards

--
Tomas Vondra

Attachments:

v20250310b-0001-Online-enabling-and-disabling-of-data-che.patchtext/x-patch; charset=UTF-8; name=v20250310b-0001-Online-enabling-and-disabling-of-data-che.patchDownload
From c8e48010604a0042d22d1b0d3b31c335277b1ceb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250310b 1/5] Online enabling and disabling of data
 checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  207 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   28 +
 src/tools/pgindent/typedefs.list              |    6 +
 54 files changed, 3031 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..a09843e4ecf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29865,6 +29865,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in the cluster
+        and enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch the data checksums mode
+        to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded, but they will no longer be updated when
+        pages are modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts one <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process per database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560e8..4a6ef5f1605 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). The last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6793,6 +6794,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase; see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
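A sketch of how the progress view added above could be queried while the
worker runs; the column names follow this version of the patch (a later
revision in this thread renames some of them, e.g. to blocks_total and
blocks_done):

    SELECT pid, datname, phase,
           databases_processed, databases_total,
           relation_current_blocks_processed, relation_current_blocks
      FROM pg_stat_progress_data_checksums;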
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations regardless of the online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..2562b1a002c 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode into
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two additional
+    processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to be removed before it finishes.
+    If long-lived temporary tables are used in the application, it may be necessary
+    to terminate those application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
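Since the section above notes that the processing consumes two background
worker slots, a quick pre-flight check before enabling might look like the
following sketch (the output comments are illustrative only):

    SHOW max_worker_processes;        -- must leave room for two more workers
    SELECT pg_enable_data_checksums();
    SHOW data_checksums;              -- "inprogress-on" until the worker finishes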
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..b5190a7e104 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking when interrogating the checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +725,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +840,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +856,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4594,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4615,13 +4628,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function must be called as close to the write operation
+ * as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  This function
+ * must be called as close to the validation call as possible to keep the
+ * critical section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
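With the show_data_checksums() hook above, the GUC is now reported as one of
four states rather than a plain boolean; a small sketch of inspecting it from
SQL:

    -- One of: off, inprogress-on, on, inprogress-off
    SELECT current_setting('data_checksums');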
@@ -6194,6 +6539,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here; it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8573,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9019,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
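Because xlog_redo() above applies the XLOG_CHECKSUMS record and emits the
matching barrier during replay, a standby follows the primary's state change
without any extra action; a sketch of observing that (timing depends on
replay progress):

    -- Run on the standby after enabling checksums on the primary.
    SELECT pg_is_in_recovery();       -- true
    SHOW data_checksums;              -- follows the primary once the record is replayed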
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
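The argument checks in enable_data_checksums() above reject out-of-range
throttling values and non-superusers; for example (error texts as in the
patch, calls shown with named arguments for clarity):

    SELECT pg_enable_data_checksums(cost_delay => -1);
    -- ERROR:  cost delay cannot be a negative value
    SELECT pg_enable_data_checksums(cost_limit => 0);
    -- ERROR:  cost limit must be greater than zero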
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..0d62976dc1f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..de7a077f9c2
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a cluster at initdb time, or offline with
+ * pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning, as no state is persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to validate data checksums safely.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects or processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: Even if the
+ *     checksum on the page already matches, we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entry point for starting data checksum processing, both for
+ * enabling and for disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the master happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. If wal_level is set to "minimal",
+		 * this could be avoided when the checksum is found to already be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums,
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clean up the abort flag to ensure that processing can be restarted
+ * again after it was previously aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active when this function was
+ * called have ended. If the postmaster dies while waiting, the process
+ * exits with a FATAL error since processing cannot be completed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management,
+ * while ProcessAllDatabases is responsible for looping over the databases
+ * and initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, we need to check if processing was
+	 * previously interrupted such that we should resume rather than start
+	 * from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first run processes the shared catalogs, rather
+	 * than once in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not be changed during processing; the
+	 * counter for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed and set the number
+			 * of relations to -1 to clear the field from previous values. -1
+			 * will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "process with restart" : "process completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums could not be enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is intended for tests, which otherwise could not reliably
+	 * be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant, we want to be explicit
+	 * about our intent for readability, since we want to be able to query
+	 * this state in case of restartability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned.
+ * If include_shared is true then shared relations are included as well in a
+ * non-temporary list. include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present as we start in this database. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access these relations to modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in
+		 * pgstat activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 24d88f368d8..00cce4c7673 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
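For reference, the four Absorb* callbacks dispatched above are implemented in xlog.c, which is not part of this excerpt. A minimal sketch of what one of them could look like, assuming the backend-local cache of the control file state is held in a variable named LocalDataChecksumVersion (a hypothetical name used only for illustration):

    /*
     * Illustrative sketch, not the patch's actual implementation: absorbing
     * the "on" barrier simply moves this backend's cached state from
     * "inprogress-on" to "on" and reports the barrier as processed.
     */
    bool
    AbsorbChecksumsOnBarrier(void)
    {
        LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
        return true;
    }

The other three callbacks would differ only in which state they install in the local cache.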
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
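The DataChecksumsEnabled() split used above follows directly from the state machine documented in datachecksumsworker.c: pages must be checksummed on write in every state except "off", while checksums are only trusted and verified once the cluster has fully reached "on". A self-contained sketch of that decision table, using hypothetical names rather than the functions this patch adds to xlog.c:

    #include <stdbool.h>

    /* The four data checksum states, mirroring the data_checksums GUC values. */
    typedef enum
    {
        CHECKSUMS_OFF,
        CHECKSUMS_INPROGRESS_ON,
        CHECKSUMS_ON,
        CHECKSUMS_INPROGRESS_OFF,
    } ChecksumState;

    /* Checksums are written in every state except "off". */
    static inline bool
    checksums_need_write(ChecksumState state)
    {
        return state != CHECKSUMS_OFF;
    }

    /* Checksums are only verified once the whole cluster has reached "on". */
    static inline bool
    checksums_need_verify(ChecksumState state)
    {
        return state == CHECKSUMS_ON;
    }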
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index e199f071628..c7cd48e8d33 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for concurrent transactions to finish before data checksum processing can start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for temporary relations to be removed before data checksums can be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 9172e1cb9d2..f6bd1e9f0ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ee1a9d5d98b..70899e6ef2d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -739,6 +739,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -869,7 +874,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ad25cbb39c5..05cba3a02e3 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1960,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5299,6 +5296,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..0bb32c9c0fc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
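The new XLOG_CHECKSUMS record carries the xl_checksum_state payload added to xlog_internal.h above, so that a change of the cluster-wide checksum state is replayed on standbys. The logging itself happens in xlog.c as part of the SetDataChecksums* functions, which are outside this excerpt; a minimal sketch of emitting such a record, with log_checksum_state_change being a hypothetical helper name, might look like:

    #include "postgres.h"

    #include "access/rmgr.h"
    #include "access/xlog_internal.h"
    #include "access/xloginsert.h"
    #include "catalog/pg_control.h"

    /* Illustrative only: WAL-log a change of the cluster-wide checksum state. */
    static XLogRecPtr
    log_checksum_state_change(ChecksumType new_checksumtype)
    {
        xl_checksum_state xlrec;

        xlrec.new_checksumtype = new_checksumtype;

        XLogBeginInsert();
        XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));

        return XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
    }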
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index cede992b6e2..fb0e062b984 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12246,6 +12246,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..1ebd0c792b4 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index a2b63495eec..9923c7f518d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The data checksums launcher and worker
+ * can consume another 2 slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation with
+multiple nodes, a primary as well as standby(s), as required by the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksums are not fully enabled yet.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6c0fe8f3bf8
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary again');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that checksums are not fully enabled yet.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..36bed4b168d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,34 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+        CASE s.param3
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param3
+        END AS relations_total,
+    s.param4 AS databases_processed,
+    s.param5 AS relations_processed,
+    s.param6 AS databases_current,
+    s.param7 AS relation_current,
+    s.param8 AS relation_current_blocks,
+    s.param9 AS relation_current_blocks_processed
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9840060997f..86e4057f61d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -593,6 +594,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4141,6 +4146,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1

Attachment: v20250310b-0002-simple-post-rebase-fixes.patch (text/x-patch)
From 252b6c3ad06b555e1cc20ca33155ee408f3cff33 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 20:19:29 +0100
Subject: [PATCH v20250310b 2/5] simple post-rebase fixes

- Update checks in PostmasterStateMachine to account for datachecksum
  workers, etc.

- Remove pgstat_bestart() call - it would need to be _initial(), but I
  don't think it's needed.

- Update vacuum_delay_point() call.

- Cast PID to long in elog call (same as we do in postmaster.c)

- Fix test 003_standby_restarts by adding a replication slot
---
 src/backend/postmaster/datachecksumsworker.c |  7 ++-----
 src/backend/postmaster/postmaster.c          |  5 +++++
 src/backend/utils/activity/pgstat_backend.c  |  2 ++
 src/test/checksum/t/003_standby_restarts.pl  | 10 +++++++++-
 4 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index de7a077f9c2..b30d5481b06 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -450,7 +450,7 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
-		vacuum_delay_point();
+		vacuum_delay_point(false);
 	}
 
 	pfree(relns);
@@ -587,7 +587,7 @@ ProcessDatabase(DataChecksumsWorkerDatabase *db)
 					db->dbname)));
 
 	snprintf(activity, sizeof(activity) - 1,
-			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
 	status = WaitForBackgroundWorkerShutdown(bgw_handle);
@@ -752,9 +752,6 @@ DataChecksumsWorkerLauncherMain(Datum arg)
 	 */
 	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
 
-	/* Initialize backend status information */
-	pgstat_bestart();
-
 	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
 	DataChecksumsWorkerShmem->launcher_running = true;
 	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6f..2fc438987b5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2947,6 +2947,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a9343b7b59e..f4b2f3b91d8 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -292,6 +292,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
index 6c0fe8f3bf8..6782664f4e6 100644
--- a/src/test/checksum/t/003_standby_restarts.pl
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -20,15 +20,23 @@ my $enable_params = '0, 100, true';
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(allows_streaming => 1, no_data_checksums => 1);
 $node_primary->start;
-my $backup_name = 'my_backup';
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
 
 # Take backup
+my $backup_name = 'my_backup';
 $node_primary->backup($backup_name);
 
 # Create streaming standby linking to primary
 my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
 $node_standby_1->init_from_backup($node_primary, $backup_name,
 	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
 $node_standby_1->start;
 
 # Create some content on the primary to have un-checksummed data in the cluster
-- 
2.48.1

Attachment: v20250310b-0003-sync-the-data_checksums-GUC-with-the-loca.patch (text/x-patch)
From 1b6849f9eb1fc52b7cf3c59e9be2559e13ce40c0 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 23:07:44 +0100
Subject: [PATCH v20250310b 3/5] sync the data_checksums GUC with the local
 variable

We now have three places that in some way express the state of data
checksums in the instance.

- control file / data_checksum_version

- LocalDataChecksumVersion as a local cache to reduce locking

- data_checksums backing the GUC

We need to keep the GUC variable in sync, to ensure we get the correct
value even early during startup (e.g. with -C).

Introduces a new SetLocalDataChecksumVersion() which sets the local
variable, and also updates the data_checksums GUC at the same time.
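
For illustration, the synced value can be observed the same way the TAP
tests do; presumably it reports off, inprogress-on, inprogress-off or on,
depending on the current state:

    -- check the data checksum state through the GUC
    -- (the inprogress-off value is assumed from DATA_CHECKSUMS_INPROGRESS_OFF)
    SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';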
---
 src/backend/access/transam/xlog.c   | 51 +++++++++++++++++++++++++----
 src/backend/utils/misc/guc_tables.c |  3 +-
 src/include/access/xlog.h           |  1 +
 3 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b5190a7e104..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,6 +657,14 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Variable backing the GUC; keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -4594,7 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4913,7 +4921,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
 
@@ -4921,21 +4929,21 @@ bool
 AbsorbChecksumsOnBarrier(void)
 {
 	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffBarrier(void)
 {
-	LocalDataChecksumVersion = 0;
+	SetLocalDataChecksumVersion(0);
 	return true;
 }
 
@@ -4951,10 +4959,41 @@ void
 InitLocalControldata(void)
 {
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 	LWLockRelease(ControlFileLock);
 }
 
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
 /* guc hook */
 const char *
 show_data_checksums(void)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 05cba3a02e3..66abf056159 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -609,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -5300,7 +5299,7 @@ struct config_enum ConfigureNamesEnum[] =
 		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether data checksums are turned on for this cluster."),
 			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
 		},
 		&data_checksums,
 		DATA_CHECKSUMS_OFF, data_checksums_options,
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 0bb32c9c0fc..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
-- 
2.48.1

Attachment: v20250310b-0004-make-progress-reporting-work.patch (text/x-patch)
From 351f66d51c9f604a956080aa3cc5cbd91becc8f0 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sat, 8 Mar 2025 19:34:50 +0100
Subject: [PATCH v20250310b 4/5] make progress reporting work

- Splits the progress status by worker type - one row for launcher,
  one row for checksum worker (might be more with parallel workers).

- The launcher only updates database counters, the workers only set
  relation/block counters.

- Also reworks the columns in the system view a bit, discards the
  "current" fields (we still know the database for each worker).

- Issue: Not sure what to do about relation forks; at this point it
  tracks only blocks for the MAIN fork.
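
For illustration, overall and per-worker progress could be watched with a
query along these lines against the reworked view, where the launcher is
the row with datid = 0:

    SELECT pid, datname, phase,
           databases_done, databases_total,
           relations_done, relations_total,
           blocks_done, blocks_total
      FROM pg_stat_progress_data_checksums;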
---
 src/backend/catalog/system_views.sql         | 36 ++++----
 src/backend/postmaster/datachecksumsworker.c | 91 +++++++++++++++++---
 src/include/commands/progress.h              | 27 +++---
 src/test/regress/expected/rules.out          | 29 ++++---
 4 files changed, 131 insertions(+), 52 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0d62976dc1f..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1335,25 +1335,25 @@ CREATE VIEW pg_stat_progress_copy AS
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
 CREATE VIEW pg_stat_progress_data_checksums AS
-	SELECT
-		S.pid AS pid, S.datid AS datid, D.datname AS datname,
-		CASE S.param1 WHEN 0 THEN 'enabling'
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
-					  WHEN 2 THEN 'waiting'
-					  WHEN 3 THEN 'waiting on backends'
-					  WHEN 4 THEN 'waiting on temporary tables'
-					  WHEN 5 THEN 'done'
-					  END AS phase,
-		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
-		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
-		S.param4 AS databases_processed,
-		S.param5 AS relations_processed,
-		S.param6 AS databases_current,
-		S.param7 AS relation_current,
-		S.param8 AS relation_current_blocks,
-		S.param9 AS relation_current_blocks_processed
-	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
-		LEFT JOIN pg_database D ON S.datid = D.oid;
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
 
 CREATE VIEW pg_user_mappings AS
     SELECT
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index b30d5481b06..4101fdff107 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -409,6 +409,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
 	/*
 	 * We are looping over the blocks which existed at the time of process
 	 * start, which is safe since new blocks are created with checksums set
@@ -450,6 +455,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
 		vacuum_delay_point(false);
 	}
 
@@ -798,6 +808,8 @@ again:
 		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
 
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -908,8 +920,29 @@ ProcessAllDatabases(bool immediate_checkpoint)
 	 * columns for processed databases is instead increased such that it can
 	 * be compared against the total.
 	 */
-	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
-								 list_length(DatabaseList));
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
 
 	while (true)
 	{
@@ -921,14 +954,6 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			DataChecksumsWorkerResultEntry *entry;
 			bool		found;
 
-			/*
-			 * Indicate which database is being processed set the number of
-			 * relations to -1 to clear field from previous values. -1 will
-			 * translate to NULL in the progress view.
-			 */
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
-
 			/*
 			 * Check if this database has been processed already, and if so
 			 * whether it should be retried or skipped.
@@ -957,6 +982,12 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			result = ProcessDatabase(db);
 			processed_databases++;
 
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
 			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
 			{
 				/*
@@ -1050,6 +1081,13 @@ ProcessAllDatabases(bool immediate_checkpoint)
 				 errhint("The server log might have more information on the cause of the error.")));
 	}
 
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint to flush
+	 * the newly written checksums to disk.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
 	/*
 	 * Force a checkpoint to get everything out to disk. The use of immediate
 	 * checkpoints is for running tests, as they would otherwise not execute
@@ -1255,6 +1293,7 @@ DataChecksumsWorkerMain(Datum arg)
 	List	   *InitialTempTableList = NIL;
 	BufferAccessStrategy strategy;
 	bool		aborted = false;
+	int64		rels_done;
 
 	enabling_checksums = true;
 
@@ -1268,6 +1307,10 @@ DataChecksumsWorkerMain(Datum arg)
 	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
 											  BGWORKER_BYPASS_ALLOWCONN);
 
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
 	/*
 	 * Get a list of all temp tables present as we start in this database. We
 	 * need to wait until they are all gone until we are done, since we cannot
@@ -1294,6 +1337,24 @@ DataChecksumsWorkerMain(Datum arg)
 
 	RelationList = BuildRelationList(false,
 									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
 	foreach_oid(reloid, RelationList)
 	{
 		if (!ProcessSingleRelationByOid(reloid, strategy))
@@ -1301,6 +1362,9 @@ DataChecksumsWorkerMain(Datum arg)
 			aborted = true;
 			break;
 		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
 	}
 	list_free(RelationList);
 
@@ -1313,6 +1377,10 @@ DataChecksumsWorkerMain(Datum arg)
 		return;
 	}
 
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
 	/*
 	 * Wait for all temp tables that existed when we started to go away. This
 	 * is necessary since we cannot "reach" them to enable checksums. Any temp
@@ -1369,5 +1437,8 @@ DataChecksumsWorkerMain(Datum arg)
 
 	list_free(InitialTempTableList);
 
+	/* worker done */
+	pgstat_progress_end_command();
+
 	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1ebd0c792b4..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -158,21 +158,20 @@
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
 /* Progress parameters for PROGRESS_DATACHECKSUMS */
-#define PROGRESS_DATACHECKSUMS_PHASE 0
-#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
-#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
-#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
-#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
-#define PROGRESS_DATACHECKSUMS_CUR_DB 5
-#define PROGRESS_DATACHECKSUMS_CUR_REL 6
-#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
-#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
 
 /* Phases of datachecksumsworker operation */
-#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
-#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
-#define PROGRESS_DATACHECKSUMS_DONE 4
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 36bed4b168d..2cfea837554 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2050,25 +2050,34 @@ pg_stat_progress_data_checksums| SELECT s.pid,
             WHEN 2 THEN 'waiting'::text
             WHEN 3 THEN 'waiting on backends'::text
             WHEN 4 THEN 'waiting on temporary tables'::text
-            WHEN 5 THEN 'done'::text
+            WHEN 5 THEN 'waiting on checkpoint'::text
+            WHEN 6 THEN 'done'::text
             ELSE NULL::text
         END AS phase,
         CASE s.param2
             WHEN '-1'::integer THEN NULL::bigint
             ELSE s.param2
         END AS databases_total,
-        CASE s.param3
+    s.param3 AS databases_done,
+        CASE s.param4
             WHEN '-1'::integer THEN NULL::bigint
-            ELSE s.param3
+            ELSE s.param4
         END AS relations_total,
-    s.param4 AS databases_processed,
-    s.param5 AS relations_processed,
-    s.param6 AS databases_current,
-    s.param7 AS relation_current,
-    s.param8 AS relation_current_blocks,
-    s.param9 AS relation_current_blocks_processed
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
    FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
-     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
-- 
2.48.1

Attachment: v20250310b-0005-update-docs.patch (text/x-patch)
From b23637b35099a581aab737a3afd58fbeddc9016a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sun, 9 Mar 2025 13:34:10 +0100
Subject: [PATCH v20250310b 5/5] update docs

---
 doc/src/sgml/monitoring.sgml | 84 ++++++++++++++++++++----------------
 doc/src/sgml/wal.sgml        |  8 ++--
 2 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4a6ef5f1605..2a568ee2357 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6804,8 +6804,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
   <para>
    When data checksums are being enabled on a running cluster, the
    <structname>pg_stat_progress_data_checksums</structname> view will contain
-   a row for each background worker which is currently calculating checksums
-   for the data pages.
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
   </para>
 
   <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
@@ -6837,37 +6837,33 @@ FROM pg_stat_get_backend_idset() AS backendid;
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>phase</structfield> <type>text</type>
-       </para>
-       <para>
-        Current processing phase, see <xref linkend="datachecksum-phases"/>
-        for description of the phases.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of the database being processed, or 0 for the launcher
+       process
+      </para></entry>
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>databases_total</structfield> <type>integer</type>
-       </para>
-       <para>
-        The total number of databases which will be processed.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_total</structfield> <type>integer</type>
+        <structfield>phase</structfield> <type>text</type>
        </para>
        <para>
-        The total number of relations which will be processed, or
-        <literal>NULL</literal> if the data checksums worker process hasn't
-        calculated the number of relations yet.
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
        </para>
       </entry>
      </row>
@@ -6875,10 +6871,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_processed</structfield> <type>integer</type>
+        <structfield>databases_total</structfield> <type>integer</type>
        </para>
        <para>
-        The number of databases which have been processed.
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the worker processes show
+        <literal>NULL</literal> here.
        </para>
       </entry>
      </row>
@@ -6886,10 +6884,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_processed</structfield> <type>integer</type>
+        <structfield>databases_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of relations which have been processed.
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the worker processes show
+        <literal>NULL</literal> here.
        </para>
       </entry>
      </row>
@@ -6897,10 +6897,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_current</structfield> <type>oid</type>
+        <structfield>relations_total</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the database currently being processed.
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process
+        reports <literal>NULL</literal> here.
        </para>
       </entry>
      </row>
@@ -6908,10 +6911,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current</structfield> <type>oid</type>
+        <structfield>relations_done</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the relation currently being processed.
+        The number of relations which have been processed. The launcher
+        process reports <literal>NULL</literal> here.
        </para>
       </entry>
      </row>
@@ -6919,10 +6923,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks</structfield> <type>integer</type>
+        <structfield>blocks_total</structfield> <type>integer</type>
        </para>
        <para>
-        The total number of blocks in the relation currently being processed.
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process
+        reports <literal>NULL</literal> here.
        </para>
       </entry>
      </row>
@@ -6930,11 +6937,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+        <structfield>blocks_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of blocks which have been processed in the relation currently
-        being processed.
+        The number of blocks in the current relation which have been processed.
+        The launcher process reports <literal>NULL</literal> here.
        </para>
       </entry>
      </row>
@@ -6982,9 +6989,10 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </entry>
      </row>
      <row>
-      <entry><literal>done</literal></entry>
+      <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
-       The command has finished processing and is exiting.
+       The command is currently waiting for a checkpoint to update the checksum
+       state at the end.
       </entry>
      </row>
     </tbody>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 2562b1a002c..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -310,15 +310,17 @@
     If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
     any reason, then this process must be restarted manually. To do this,
     re-execute the function <function>pg_enable_data_checksums()</function>
-    once the cluster has been restarted. The background worker will attempt
-    to resume the work from where it was interrupted.
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
    </para>
 
    <note>
     <para>
      Enabling checksums can cause significant I/O to the system, as most of the
      database pages will need to be rewritten, and will be written both to the
-     data files and the WAL.
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
     </para>
    </note>
 
-- 
2.48.1
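
For illustration, a minimal query against the progress view as renamed in the
documentation hunks above (a sketch against this revision; the launcher row
reports datid 0 with a NULL datname and leaves the per-relation counters NULL):

  SELECT datid, datname, phase,
         relations_total, relations_done,
         blocks_total, blocks_done
  FROM pg_stat_progress_data_checksums;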

#23Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#22)
5 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 3/10/25 01:18, Tomas Vondra wrote:

...

There's still a failure on windows, though. I'd bet that's due to the
data_checksum/LocalDatachecksumVersion sync not working correctly on
builds with EXEC_BACKEND, or something like that, but it's too late so
I'll take a closer look tomorrow.

Just like I suspected, there was a bug in EXEC_BACKEND, although a bit
different from what I guessed - the worker state in shmem was zeroed
every time, not just once. And a second issue was child_process_kinds
got out of sync with BackendType (mea culpa).

For me, this passes all CI tests, hopefully cfbot will be happy too.

regards

--
Tomas Vondra

Attachments:

v20250310c-0001-Online-enabling-and-disabling-of-data-che.patchtext/x-patch; charset=UTF-8; name=v20250310c-0001-Online-enabling-and-disabling-of-data-che.patchDownload
From adf6848e0cff38ac861f921dc3b7ef6462f0d86a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250310c 1/5] Online enabling and disabling of data
 checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled during initdb or
when the cluster is offline using the pg_checksums application. This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  207 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   28 +
 src/tools/pgindent/typedefs.list              |    6 +
 54 files changed, 3031 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..a09843e4ecf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29865,6 +29865,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
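
As a usage sketch for the two functions documented above (parameter names as
defined in this patch; throttling follows cost-based vacuum delay):

  -- enable, throttled; data_checksums goes to inprogress-on, then to on
  -- once every page has been checksummed by the background worker
  SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);

  -- disable again; the state passes through inprogress-off before off
  SELECT pg_disable_data_checksums();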
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560e8..4a6ef5f1605 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6793,6 +6794,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   online when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations regardless of any work already done online.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..2562b1a002c 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in the
+    <literal>inprogress-on</literal> state.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two additional
+    processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
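
A sketch of the manual restart described above, once the cluster is back up
while the checksum mode is still inprogress-on:

  SHOW data_checksums;                 -- expected to report inprogress-on
  SELECT pg_enable_data_checksums();   -- relaunches the background worker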
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..b5190a7e104 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile's data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +725,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +840,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +856,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4594,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4615,13 +4628,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function must be called as close to the write operation
+ * as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  This function
+ * must be called as close to the validation call as possible to keep the
+ * critical section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6194,6 +6539,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here, it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8573,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9019,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
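
The argument checks above can be exercised from SQL; a sketch of the expected
failures (error texts as in the ereport calls above):

  SELECT pg_enable_data_checksums(cost_delay => -1);
  -- ERROR:  cost delay cannot be a negative value
  SELECT pg_enable_data_checksums(cost_limit => 0);
  -- ERROR:  cost limit must be greater than zero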
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..0d62976dc1f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
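
For reference, a minimal query against the view defined above (the phase text
comes from the CASE mapping on param1):

  SELECT pid, datname, phase,
         databases_processed, databases_total,
         relations_processed, relations_total,
         relation_current_blocks_processed, relation_current_blocks
  FROM pg_stat_progress_data_checksums;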
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..de7a077f9c2
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a database at initdb time or when shut down
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to perform data checksum validation.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
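+ *
+ * In summary, the state transitions driven by the launcher are:
+ *
+ *   enabling:   "off" -> "inprogress-on" -> (all pages checksummed) -> "on"
+ *   disabling:  "on" -> "inprogress-off" -> "off"
+ *
+ * where every state change is communicated to all backends with a
+ * procsignalbarrier which the process updating the global state waits on.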
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart, since dynamic background workers cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: If the checksum
+ *     on the page already happens to match, we still dirty the page. It
+ *     should be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip already checksummed pages when it is used
+ *     to enable checksums on a cluster in the "inprogress-on" state, which
+ *     may contain checksummed pages (that is, make pg_checksums able to
+ *     resume an online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
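+
+/*
+ * For reference, the SQL-callable entry points pg_enable_data_checksums()
+ * and pg_disable_data_checksums() (see the pg_proc.dat additions in this
+ * patch) are expected to funnel into StartDataChecksumsWorkerLauncher().  As
+ * a minimal sketch, assuming an fmgr wrapper along these lines (the actual
+ * wrapper lives elsewhere in the patch and may differ in detail):
+ *
+ *     Datum
+ *     enable_data_checksums(PG_FUNCTION_ARGS)
+ *     {
+ *         int         cost_delay = PG_GETARG_INT32(0);
+ *         int         cost_limit = PG_GETARG_INT32(1);
+ *         bool        fast = PG_GETARG_BOOL(2);
+ *
+ *         StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+ *         PG_RETURN_VOID();
+ *     }
+ */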
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation in pg_stat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. If wal_level is set to "minimal",
+		 * this could be avoided, but only if the checksum is verified to be
+		 * correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort, and
+		 * the abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
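+		/*
+		 * Throttle using the vacuum cost machinery; the cost parameters are
+		 * taken from the launch_cost_* values and set up at the start of
+		 * DataChecksumsWorkerMain().
+		 */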
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't have set the data checksum state in pg_control to enabled at
+	 * this point, so when the cluster comes back up processing will have to
+	 * be restarted. When disabling, the pg_control state will have been set
+	 * to off before this, so when the cluster comes up checksums will be off
+	 * as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksum processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory, or else it would
+ * not be possible to start a new launcher later on.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, the process exits
+ * with FATAL since data checksum processing cannot be completed.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're waiting.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide, so exit; processing will have to be restarted once
+		 * the cluster is back up.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management,
+ * while ProcessAllDatabases is responsible for looping over the databases
+ * and initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, we first check whether data
+	 * checksums are already enabled, in which case we can exit early rather
+	 * than process every database from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier,
+		 * then switch progress reporting to the enabling phase.
+		 */
+		SetDataChecksumsOnInProgress();
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums, comparing each new list against the databases already seen,
+ * until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first run processes the shared catalogs, which are
+	 * then not reprocessed for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not be changed during processing; the
+	 * column for processed databases is instead increased such that it can be
+	 * compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed, and set the number
+			 * of relations to -1 to clear the field from previous values. -1
+			 * will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "starting another pass" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = true;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. Immediate checkpoints
+	 * are intended for testing, where a spread checkpoint would make it hard
+	 * to keep the tests under reliable timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
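+ *
+ * The size computed here is added to the cluster-wide shared memory request
+ * in CalculateShmemSize(), and the struct itself is created in
+ * DataChecksumsWorkerShmemInit() below (see the ipci.c changes in this patch).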
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though the MemSet above already cleared these fields, set them
+	 * explicitly to be clear about the intended initial state.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is True then only temporary relations are returned. If temp_relations is
+ * False then non-temporary relations which have storage, and thus need data
+ * checksums, are returned. If include_shared is True then shared relations
+ * are included as well in a non-temporary list. include_shared has no
+ * relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any. Only the cost balance is reset; the
+	 * per-page cost parameters are left at their configured values so that
+	 * the cost limit can actually throttle the page traversal.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate that in the
+		 * pg_stat_activity status.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 24d88f368d8..00cce4c7673 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 3c594415bfd..b1b5cdcf36c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksum enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 9172e1cb9d2..f6bd1e9f0ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ee1a9d5d98b..70899e6ef2d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -739,6 +739,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -869,7 +874,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ad25cbb39c5..05cba3a02e3 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1960,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5299,6 +5296,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..0bb32c9c0fc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index cede992b6e2..fb0e062b984 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12246,6 +12246,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..1ebd0c792b4 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index a2b63495eec..9923c7f518d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The DataChecksums worker and launcher
+ * can consume 2 slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: "make check" creates a temporary installation, with multiple
+nodes, primary as well as standby(s), for the purpose of running
+the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Change the underlying data and then re-enable checksums, to ensure that the
+# newly computed checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this with an interactive psql session which keeps the temporary
+# table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that processing stays in the inprogress-on state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6c0fe8f3bf8
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary 2');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this with an interactive psql session which keeps the temporary
+# table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that processing stays in the inprogress-on state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..36bed4b168d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,34 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+        CASE s.param3
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param3
+        END AS relations_total,
+    s.param4 AS databases_processed,
+    s.param5 AS relations_processed,
+    s.param6 AS databases_current,
+    s.param7 AS relation_current,
+    s.param8 AS relation_current_blocks,
+    s.param9 AS relation_current_blocks_processed
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9840060997f..86e4057f61d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -593,6 +594,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4141,6 +4146,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1

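To make the user-facing flow of the main patch easy to eyeball, here is a
minimal usage sketch (not part of the patch itself; it just strings together
the SQL function added in pg_proc.dat and the polling query the TAP tests use,
with the same parameter values the tests pass):

    -- kick off checksum enabling: cost_delay, cost_limit, fast
    SELECT pg_enable_data_checksums(0, 100, true);

    -- poll until the cluster-wide state goes from 'inprogress-on' to 'on'
    SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';

    -- and to turn checksums off again
    SELECT pg_disable_data_checksums();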
Attachment: v20250310c-0002-simple-post-rebase-fixes.patch (text/x-patch)
From f01c15887f1c612cd7769bf9f3f6dd22525c493d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 20:19:29 +0100
Subject: [PATCH v20250310c 2/5] simple post-rebase fixes

- Update checks in PostmasterStateMachine to account for datachecksum
  workers, etc.

- Remove pgstat_bestart() call - it would need to be _initial(), but I
  don't think it's needed.

- Update vacuum_delay_point() call.

- Cast PID to long in elog call (same as we do in postmaster.c)

- Fix test 003_standby_restarts by adding a replication slot

- Fix DataChecksumsWorkerShmemInit to only zero the memory once.

- Update child_process_kinds to keep it in sync with BackendType.
---
 src/backend/postmaster/datachecksumsworker.c | 28 ++++++++++----------
 src/backend/postmaster/launch_backend.c      |  3 +++
 src/backend/postmaster/postmaster.c          |  5 ++++
 src/backend/utils/activity/pgstat_backend.c  |  2 ++
 src/test/checksum/t/003_standby_restarts.pl  | 10 ++++++-
 5 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index de7a077f9c2..6df92684a3b 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -450,7 +450,7 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
-		vacuum_delay_point();
+		vacuum_delay_point(false);
 	}
 
 	pfree(relns);
@@ -587,7 +587,7 @@ ProcessDatabase(DataChecksumsWorkerDatabase *db)
 					db->dbname)));
 
 	snprintf(activity, sizeof(activity) - 1,
-			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
 	status = WaitForBackgroundWorkerShutdown(bgw_handle);
@@ -752,9 +752,6 @@ DataChecksumsWorkerLauncherMain(Datum arg)
 	 */
 	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
 
-	/* Initialize backend status information */
-	pgstat_bestart();
-
 	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
 	DataChecksumsWorkerShmem->launcher_running = true;
 	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
@@ -1095,16 +1092,19 @@ DataChecksumsWorkerShmemInit(void)
 						DataChecksumsWorkerShmemSize(),
 						&found);
 
-	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
 
-	/*
-	 * Even if this is a redundant assignment, we want to be explicit about
-	 * our intent for readability, since we want to be able to query this
-	 * state in case of restartability.
-	 */
-	DataChecksumsWorkerShmem->launch_enable_checksums = false;
-	DataChecksumsWorkerShmem->launcher_running = false;
-	DataChecksumsWorkerShmem->launch_fast = false;
+		/*
+		 * Even if this is a redundant assignment, we want to be explicit about
+		 * our intent for readability, since we want to be able to query this
+		 * state in case of restartability.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
 }
 
 /*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 47375e5bfaa..92d8017fd56 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -202,6 +202,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6f..2fc438987b5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2947,6 +2947,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 6efbb650aa8..bd458f8c1af 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -295,6 +295,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
index 6c0fe8f3bf8..6782664f4e6 100644
--- a/src/test/checksum/t/003_standby_restarts.pl
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -20,15 +20,23 @@ my $enable_params = '0, 100, true';
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(allows_streaming => 1, no_data_checksums => 1);
 $node_primary->start;
-my $backup_name = 'my_backup';
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
 
 # Take backup
+my $backup_name = 'my_backup';
 $node_primary->backup($backup_name);
 
 # Create streaming standby linking to primary
 my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
 $node_standby_1->init_from_backup($node_primary, $backup_name,
 	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
 $node_standby_1->start;
 
 # Create some content on the primary to have un-checksummed data in the cluster
-- 
2.48.1

Attachment: v20250310c-0003-sync-the-data_checksums-GUC-with-the-loca.patch (text/x-patch)
From 5fe7c540af4c3bf004cae9f6b6dfa69d2ab9aa84 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 23:07:44 +0100
Subject: [PATCH v20250310c 3/5] sync the data_checksums GUC with the local
 variable

We now have three places that in some way express the state of data
checksums in the instance.

- control file / data_checksum_version

- LocalDataChecksumVersion as a local cache to reduce locking

- data_checksums backing the GUC

We need to keep the GUC variable in sync, to ensure we get the correct
value even early during startup (e.g. with -C).

Introduces a new SetLocalDataChecksumVersion() which sets the local
variable, and also updates the data_checksum GUC at the same time.
---
 src/backend/access/transam/xlog.c   | 51 +++++++++++++++++++++++++----
 src/backend/utils/misc/guc_tables.c |  3 +-
 src/include/access/xlog.h           |  1 +
 3 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b5190a7e104..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,6 +657,14 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -4594,7 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4913,7 +4921,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
 
@@ -4921,21 +4929,21 @@ bool
 AbsorbChecksumsOnBarrier(void)
 {
 	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffBarrier(void)
 {
-	LocalDataChecksumVersion = 0;
+	SetLocalDataChecksumVersion(0);
 	return true;
 }
 
@@ -4951,10 +4959,41 @@ void
 InitLocalControldata(void)
 {
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 	LWLockRelease(ControlFileLock);
 }
 
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
 /* guc hook */
 const char *
 show_data_checksums(void)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 05cba3a02e3..66abf056159 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -609,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -5300,7 +5299,7 @@ struct config_enum ConfigureNamesEnum[] =
 		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether data checksums are turned on for this cluster."),
 			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
 		},
 		&data_checksums,
 		DATA_CHECKSUMS_OFF, data_checksums_options,
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 0bb32c9c0fc..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
-- 
2.48.1

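As a small illustration of what 0003 buys us (a sketch, not part of the
patch): with the GUC kept in sync with LocalDataChecksumVersion, the
intermediate states become visible through the regular GUC machinery instead
of just 'on'/'off':

    SHOW data_checksums;
    -- one of: off, on, inprogress-on, inprogress-off

    -- equivalent to what the TAP tests poll:
    SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';

Presumably this is also why the GUC is now marked GUC_RUNTIME_COMPUTED, so
that the early-startup (-C) case mentioned in the commit message reports the
value taken from the control file.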
Attachment: v20250310c-0004-make-progress-reporting-work.patch (text/x-patch)
From d03c53fca689f5303a5f4f8e8d142687ed2c44d9 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sat, 8 Mar 2025 19:34:50 +0100
Subject: [PATCH v20250310c 4/5] make progress reporting work

- Splits the progress status by worker type - one row for launcher,
  one row for checksum worker (might be more with parallel workers).

- The launcher only updates database counters, the workers only set
  relation/block counters.

- Also reworks the columns in the system view a bit, discards the
  "current" fields (we still know the database for each worker).

- Issue: Not sure what to do about relation forks; at this point it
  only tracks blocks for the MAIN fork.
---
 src/backend/catalog/system_views.sql         | 36 ++++----
 src/backend/postmaster/datachecksumsworker.c | 91 +++++++++++++++++---
 src/include/commands/progress.h              | 27 +++---
 src/test/regress/expected/rules.out          | 29 ++++---
 4 files changed, 131 insertions(+), 52 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0d62976dc1f..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1335,25 +1335,25 @@ CREATE VIEW pg_stat_progress_copy AS
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
 CREATE VIEW pg_stat_progress_data_checksums AS
-	SELECT
-		S.pid AS pid, S.datid AS datid, D.datname AS datname,
-		CASE S.param1 WHEN 0 THEN 'enabling'
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
-					  WHEN 2 THEN 'waiting'
-					  WHEN 3 THEN 'waiting on backends'
-					  WHEN 4 THEN 'waiting on temporary tables'
-					  WHEN 5 THEN 'done'
-					  END AS phase,
-		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
-		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
-		S.param4 AS databases_processed,
-		S.param5 AS relations_processed,
-		S.param6 AS databases_current,
-		S.param7 AS relation_current,
-		S.param8 AS relation_current_blocks,
-		S.param9 AS relation_current_blocks_processed
-	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
-		LEFT JOIN pg_database D ON S.datid = D.oid;
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
 
 CREATE VIEW pg_user_mappings AS
     SELECT
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index 6df92684a3b..bbbce61cfa6 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -409,6 +409,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
 	/*
 	 * We are looping over the blocks which existed at the time of process
 	 * start, which is safe since new blocks are created with checksums set
@@ -450,6 +455,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
 		vacuum_delay_point(false);
 	}
 
@@ -798,6 +808,8 @@ again:
 		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
 
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -908,8 +920,29 @@ ProcessAllDatabases(bool immediate_checkpoint)
 	 * columns for processed databases is instead increased such that it can
 	 * be compared against the total.
 	 */
-	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
-								 list_length(DatabaseList));
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
 
 	while (true)
 	{
@@ -921,14 +954,6 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			DataChecksumsWorkerResultEntry *entry;
 			bool		found;
 
-			/*
-			 * Indicate which database is being processed set the number of
-			 * relations to -1 to clear field from previous values. -1 will
-			 * translate to NULL in the progress view.
-			 */
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
-
 			/*
 			 * Check if this database has been processed already, and if so
 			 * whether it should be retried or skipped.
@@ -957,6 +982,12 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			result = ProcessDatabase(db);
 			processed_databases++;
 
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
 			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
 			{
 				/*
@@ -1050,6 +1081,13 @@ ProcessAllDatabases(bool immediate_checkpoint)
 				 errhint("The server log might have more information on the cause of the error.")));
 	}
 
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksummed pages to be flushed out to disk.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
 	/*
 	 * Force a checkpoint to get everything out to disk. The use of immediate
 	 * checkpoints is for running tests, as they would otherwise not execute
@@ -1258,6 +1296,7 @@ DataChecksumsWorkerMain(Datum arg)
 	List	   *InitialTempTableList = NIL;
 	BufferAccessStrategy strategy;
 	bool		aborted = false;
+	int64		rels_done;
 
 	enabling_checksums = true;
 
@@ -1271,6 +1310,10 @@ DataChecksumsWorkerMain(Datum arg)
 	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
 											  BGWORKER_BYPASS_ALLOWCONN);
 
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
 	/*
 	 * Get a list of all temp tables present as we start in this database. We
 	 * need to wait until they are all gone until we are done, since we cannot
@@ -1297,6 +1340,24 @@ DataChecksumsWorkerMain(Datum arg)
 
 	RelationList = BuildRelationList(false,
 									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
 	foreach_oid(reloid, RelationList)
 	{
 		if (!ProcessSingleRelationByOid(reloid, strategy))
@@ -1304,6 +1365,9 @@ DataChecksumsWorkerMain(Datum arg)
 			aborted = true;
 			break;
 		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
 	}
 	list_free(RelationList);
 
@@ -1316,6 +1380,10 @@ DataChecksumsWorkerMain(Datum arg)
 		return;
 	}
 
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
 	/*
 	 * Wait for all temp tables that existed when we started to go away. This
 	 * is necessary since we cannot "reach" them to enable checksums. Any temp
@@ -1372,5 +1440,8 @@ DataChecksumsWorkerMain(Datum arg)
 
 	list_free(InitialTempTableList);
 
+	/* worker done */
+	pgstat_progress_end_command();
+
 	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1ebd0c792b4..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -158,21 +158,20 @@
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
 /* Progress parameters for PROGRESS_DATACHECKSUMS */
-#define PROGRESS_DATACHECKSUMS_PHASE 0
-#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
-#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
-#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
-#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
-#define PROGRESS_DATACHECKSUMS_CUR_DB 5
-#define PROGRESS_DATACHECKSUMS_CUR_REL 6
-#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
-#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
 
 /* Phases of datachecksumsworker operation */
-#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
-#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
-#define PROGRESS_DATACHECKSUMS_DONE 4
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 36bed4b168d..2cfea837554 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2050,25 +2050,34 @@ pg_stat_progress_data_checksums| SELECT s.pid,
             WHEN 2 THEN 'waiting'::text
             WHEN 3 THEN 'waiting on backends'::text
             WHEN 4 THEN 'waiting on temporary tables'::text
-            WHEN 5 THEN 'done'::text
+            WHEN 5 THEN 'waiting on checkpoint'::text
+            WHEN 6 THEN 'done'::text
             ELSE NULL::text
         END AS phase,
         CASE s.param2
             WHEN '-1'::integer THEN NULL::bigint
             ELSE s.param2
         END AS databases_total,
-        CASE s.param3
+    s.param3 AS databases_done,
+        CASE s.param4
             WHEN '-1'::integer THEN NULL::bigint
-            ELSE s.param3
+            ELSE s.param4
         END AS relations_total,
-    s.param4 AS databases_processed,
-    s.param5 AS relations_processed,
-    s.param6 AS databases_current,
-    s.param7 AS relation_current,
-    s.param8 AS relation_current_blocks,
-    s.param9 AS relation_current_blocks_processed
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
    FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
-     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
-- 
2.48.1

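For reviewers who want to watch the reworked progress reporting from 0004
while a run is in flight, something along these lines should do (a sketch;
column names as in the updated system_views.sql, with the launcher row sorted
first per the ORDER BY):

    SELECT pid, datname, phase,
           databases_total, databases_done,
           relations_total, relations_done,
           blocks_total, blocks_done
      FROM pg_stat_progress_data_checksums;

The launcher row carries the database counters, while each per-database worker
row carries the relation and block counters (MAIN fork only, per the XXX note
in the patch).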
Attachment: v20250310c-0005-update-docs.patch (text/x-patch)
From 10c411612677b7fa1bc01d26fd923e92f4ec108b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sun, 9 Mar 2025 13:34:10 +0100
Subject: [PATCH v20250310c 5/5] update docs

---
 doc/src/sgml/monitoring.sgml | 84 ++++++++++++++++++++----------------
 doc/src/sgml/wal.sgml        |  8 ++--
 2 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4a6ef5f1605..2a568ee2357 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6804,8 +6804,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
   <para>
    When data checksums are being enabled on a running cluster, the
    <structname>pg_stat_progress_data_checksums</structname> view will contain
-   a row for each background worker which is currently calculating checksums
-   for the data pages.
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
   </para>
 
   <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
@@ -6837,37 +6837,33 @@ FROM pg_stat_get_backend_idset() AS backendid;
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>phase</structfield> <type>text</type>
-       </para>
-       <para>
-        Current processing phase, see <xref linkend="datachecksum-phases"/>
-        for description of the phases.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of the database being processed, or 0 for the launcher
+       process.
+      </para></entry>
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>databases_total</structfield> <type>integer</type>
-       </para>
-       <para>
-        The total number of databases which will be processed.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of the database being processed, or <literal>NULL</literal> for
+       the launcher process.
+      </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_total</structfield> <type>integer</type>
+        <structfield>phase</structfield> <type>text</type>
        </para>
        <para>
-        The total number of relations which will be processed, or
-        <literal>NULL</literal> if the data checksums worker process hasn't
-        calculated the number of relations yet.
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
        </para>
       </entry>
      </row>
@@ -6875,10 +6871,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_processed</structfield> <type>integer</type>
+        <structfield>databases_total</structfield> <type>integer</type>
        </para>
        <para>
-        The number of databases which have been processed.
+        The total number of databases which will be processed. Only the
+        launcher process reports this value; the worker processes show
+        <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6886,10 +6884,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_processed</structfield> <type>integer</type>
+        <structfield>databases_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of relations which have been processed.
+        The number of databases which have been processed. Only the
+        launcher process reports this value; the worker processes show
+        <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6897,10 +6897,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_current</structfield> <type>oid</type>
+        <structfield>relations_total</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the database currently being processed.
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process always
+        shows <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6908,10 +6911,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current</structfield> <type>oid</type>
+        <structfield>relations_done</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the relation currently being processed.
+        The number of relations which have been processed. The launcher
+        process always shows <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6919,10 +6923,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks</structfield> <type>integer</type>
+        <structfield>blocks_total</structfield> <type>integer</type>
        </para>
        <para>
-        The total number of blocks in the relation currently being processed.
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process always
+        shows <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6930,11 +6937,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+        <structfield>blocks_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of blocks which have been processed in the relation currently
-        being processed.
+        The number of blocks in the current relation which have been processed.
+        The launcher process always shows <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6982,9 +6989,10 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </entry>
      </row>
      <row>
-      <entry><literal>done</literal></entry>
+      <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
-       The command has finished processing and is exiting.
+       The command is currently waiting for a checkpoint to complete before
+       updating the final checksum state.
       </entry>
      </row>
     </tbody>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 2562b1a002c..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -310,15 +310,17 @@
     If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
     any reason, then this process must be restarted manually. To do this,
     re-execute the function <function>pg_enable_data_checksums()</function>
-    once the cluster has been restarted. The background worker will attempt
-    to resume the work from where it was interrupted.
+    once the cluster has been restarted. The process will start over;
+    resuming work from where it was interrupted is not supported.
    </para>
 
    <note>
     <para>
      Enabling checksums can cause significant I/O to the system, as most of the
      database pages will need to be rewritten, and will be written both to the
-     data files and the WAL.
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
     </para>
    </note>
 
-- 
2.48.1

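To make the documented flow concrete, here is a minimal illustrative sketch of enabling and disabling checksums using the functions and states described in the patches; the parameter values are arbitrary and this is not a transcript of an actual session:

    -- enable checksums with vacuum-style cost-based throttling
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);

    -- the cluster-wide state passes through inprogress-on before reaching on
    SHOW data_checksums;

    -- disabling passes through inprogress-off before reaching off
    SELECT pg_disable_data_checksums();

If the cluster restarts while still in inprogress-on, pg_enable_data_checksums() has to be re-executed afterwards, as per the updated wal.sgml text above.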
#24Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#23)
5 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 3/10/25 10:46, Tomas Vondra wrote:

On 3/10/25 01:18, Tomas Vondra wrote:

...

There's still a failure on windows, though. I'd bet that's due to the
data_checksum/LocalDatachecksumVersion sync not working correctly on
builds with EXEC_BACKEND, or something like that, but it's too late so
I'll take a closer look tomorrow.

Just like I suspected, there was a bug in the EXEC_BACKEND case, although a
bit different from what I guessed - the worker state in shmem was zeroed
every time, not just once. And a second issue was that child_process_kinds
got out of sync with BackendType (mea culpa).

For me, this passes all CI tests, hopefully cfbot will be happy too.

A bit embarrassing: I did not notice that updating child_process_kinds breaks
the stats regression test, so here's a version fixing that.

regards

--
Tomas Vondra

Attachments:

v20250310d-0001-Online-enabling-and-disabling-of-data-che.patch (text/x-patch)
From adf6848e0cff38ac861f921dc3b7ef6462f0d86a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250310d 1/5] Online enabling and disabling of data
 checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled during initdb or
when the cluster is offline using the pg_checksums application. This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  207 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   57 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  450 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1376 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   32 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   16 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   18 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 ++
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  131 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   28 +
 src/tools/pgindent/typedefs.list              |    6 +
 54 files changed, 3031 insertions(+), 49 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..a09843e4ecf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29865,6 +29865,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all data pages in all
+        databases and enable checksums on them. When all data pages have had
+        checksums enabled, the cluster will automatically switch the data
+        checksums mode to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     processes for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560e8..4a6ef5f1605 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6793,6 +6794,204 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for each background worker which is currently calculating checksums
+   for the data pages.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a datachecksumsworker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the database currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of blocks in the relation currently being processed.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks which have been processed in the relation currently
+        being processed.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command has finished processing and is exiting.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any prior online progress.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..2562b1a002c 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,54 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode into the
+    <literal>inprogress-on</literal> state.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two additional
+    processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to be removed before it finishes.
+    If long-lived temporary tables are used in the application, it may be necessary
+    to terminate those connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The background worker will attempt
+    to resume the work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..b5190a7e104 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,16 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile's data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +725,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +840,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +856,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4594,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
 }
 
 /*
@@ -4615,13 +4628,345 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states,
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function must be called as close to the write operation
+ * as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  This function must be
+ * called as close to the validation call as possible to keep the critical
+ * section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used while the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description of how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initialized with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	LocalDataChecksumVersion = 0;
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	LWLockRelease(ControlFileLock);
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6194,6 +6539,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here; it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8573,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9019,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to wait until all processes have seen
+		 * the change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..0d62976dc1f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+	SELECT
+		S.pid AS pid, S.datid AS datid, D.datname AS datname,
+		CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+					  WHEN 2 THEN 'waiting'
+					  WHEN 3 THEN 'waiting on backends'
+					  WHEN 4 THEN 'waiting on temporary tables'
+					  WHEN 5 THEN 'done'
+					  END AS phase,
+		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
+		S.param4 AS databases_processed,
+		S.param5 AS relations_processed,
+		S.param6 AS databases_current,
+		S.param7 AS relation_current,
+		S.param8 AS relation_current_blocks,
+		S.param9 AS relation_current_blocks_processed
+	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+		LEFT JOIN pg_database D ON S.datid = D.oid;
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..de7a077f9c2
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1376 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or offline with
+ * pg_checksums while the cluster is shut down, no extra process is required
+ * as each page is checksummed, and verified, when accessed.  When enabling
+ * checksums on an already running cluster, this worker will ensure that all
+ * pages are checksummed before verification of the checksums is turned on.
+ * In the case of disabling checksums, the state transition is performed only
+ * in the control file; no changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will have to be
+ * restarted from the beginning as the state isn't persisted.
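+ *
+ * A minimal SQL-level sketch of enabling and monitoring (argument defaults
+ * and exact output are assumptions, for illustration only):
+ *
+ *     SELECT pg_enable_data_checksums();
+ *     SELECT datname, phase FROM pg_stat_progress_data_checksums;
+ *     SHOW data_checksums;    -- "inprogress-on" until processing completes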
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off",
+ * which signals that checksums are still written but no longer verified.  This
+ * ensures that backends which have yet to move from the "on" state can still
+ * validate data checksums correctly.
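+ *
+ * The corresponding sketch for disabling (illustrative only):
+ *
+ *     SELECT pg_disable_data_checksums();
+ *     SHOW data_checksums;    -- "inprogress-off", and finally "off"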
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness.  Correctness is defined
+ * as follows:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
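+ *
+ * The resulting pattern at an IO site is sketched below (an illustration of
+ * the rule above, not a verbatim copy of any caller):
+ *
+ *     HOLD_INTERRUPTS();
+ *     if (DataChecksumsNeedWrite())
+ *         ((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
+ *     RESUME_INTERRUPTS();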
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd, making the other sets empty.
+ *   Backends in Bo write data checksums, but don't validate them, so that
+ *   backends still in Be can continue to validate pages until they too have
+ *   absorbed the barrier and moved to Bo.  Once all backends are in Bo, the
+ *   barrier to transition to "off" can be emitted and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
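+ *
+ *   Taken together, the cluster-wide data_checksums transitions are (a
+ *   summary of the states described above):
+ *
+ *       off -> inprogress-on  -> on       (pg_enable_data_checksums)
+ *       on  -> inprogress-off -> off      (pg_disable_data_checksums)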
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version.  These ideas are listed without any validation of their
+ * feasibility or potential payoff.  More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching the datachecksumsworker from the startup process to resume
+ *     operation: Currently users have to restart processing manually after a
+ *     restart since a dynamic background worker cannot be started from the
+ *     postmaster.  Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when the checksum already matches: If the
+ *     checksum on the page happens to already match, we still dirty the page.
+ *     It should be enough to only do the log_newpage_buffer() call in that
+ *     case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Entry point for starting the datachecksumsworker launcher
+ *
+ * This initiates data checksums processing, for enabling as well as
+ * disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		already_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	already_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!already_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pg_stat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the primary happening while checksums
+		 * were off.  This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  If wal_level is set to "minimal",
+		 * this could be avoided if the checksum is calculated to be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we have been asked to
+		 * abort; the abort will bubble up from here.  It's safe to check this
+		 * without a lock, because if we miss it being set, we will try again
+		 * soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		vacuum_delay_point();
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting.  When enabling checksums,
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up, processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes up, checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits.  We
+ * need to clear the launcher_running flag in shared memory to ensure that a
+ * new launcher can be started after this one has exited or was aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * after each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks waiting for all current transactions to finish
+ *
+ * Returns when all transactions which were active at the call of the function
+ * have ended.  If the postmaster dies while waiting, the process exits with
+ * FATAL since processing cannot be completed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	/* Initialize backend status information */
+	pgstat_bestart();
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, check the current state first as
+	 * checksums might already be enabled, and otherwise process all databases
+	 * before the cluster-wide state is switched to "on".
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database run processes the shared catalogs,
+	 * but they aren't processed again for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not change during processing; the column
+	 * for processed databases is instead increased such that it can be
+	 * compared against the total.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
+								 list_length(DatabaseList));
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Indicate which database is being processed and set the number
+			 * of relations to -1 to clear the field from previous values.
+			 * -1 will translate to NULL in the progress view.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "process with restart" : "process completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist, since
+		 * a dropped database no longer needs checksums.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+	/*
+	 * Even though these assignments are redundant after the MemSet above, we
+	 * want to be explicit about our intent for readability.
+	 */
+	DataChecksumsWorkerShmem->launch_enable_checksums = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	DataChecksumsWorkerShmem->launch_fast = false;
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list. include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in the
+		 * pg_stat_activity reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 24d88f368d8..00cce4c7673 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 3c594415bfd..b1b5cdcf36c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksum enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 9172e1cb9d2..f6bd1e9f0ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ee1a9d5d98b..70899e6ef2d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -739,6 +739,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -869,7 +874,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ad25cbb39c5..05cba3a02e3 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,7 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
+static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1960,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5299,6 +5296,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are in an in-progress state in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..0bb32c9c0fc 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -117,7 +117,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index cede992b6e2..fb0e062b984 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12246,6 +12246,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..1ebd0c792b4 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,22 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE 0
+#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
+#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
+#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
+#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
+#define PROGRESS_DATACHECKSUMS_CUR_DB 5
+#define PROGRESS_DATACHECKSUMS_CUR_REL 6
+#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
+#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
+#define PROGRESS_DATACHECKSUMS_DONE 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index a2b63495eec..9923c7f518d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 means that data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The datachecksums launcher and worker
+ * can consume 2 more slots while checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling and disabling data
+checksums in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: This creates a temporary installation (in the case of "check"),
+with multiple nodes (a primary and one or more standbys) as required
+by the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit,
+# and fast. For testing we always want to override the default value of
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit and are fine for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Change the underlying data before re-enabling checksums, to ensure that the
+# newly computed checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit,
+# and fast. For testing we always want to override the default value of
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit and are fine for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this by setting up an interactive psql session which keeps the
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that it remains in the in-progress state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6c0fe8f3bf8
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,131 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit,
+# and fast. For testing we always want to override the default value of
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit and are fine for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+my $backup_name = 'my_backup';
+
+# Take backup
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit,
+# and fast. For testing we always want to override the default value of
+# 'fast' with true, which causes immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit and are fine for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this by setting up an interactive psql session which keeps the
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that it remains in the in-progress state.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..36bed4b168d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,34 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+        CASE s.param3
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param3
+        END AS relations_total,
+    s.param4 AS databases_processed,
+    s.param5 AS relations_processed,
+    s.param6 AS databases_current,
+    s.param7 AS relation_current,
+    s.param8 AS relation_current_blocks,
+    s.param9 AS relation_current_blocks_processed
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9840060997f..86e4057f61d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -593,6 +594,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4141,6 +4146,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1

Attachment: v20250310d-0002-simple-post-rebase-fixes.patch (text/x-patch)
From 8a72efd62662efd8672d84d2872fbf54173c9337 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 20:19:29 +0100
Subject: [PATCH v20250310d 2/5] simple post-rebase fixes

- Update checks in PostmasterStateMachine to account for datachecksum
  workers, etc.

- Remove pgstat_bestart() call - it would need to be _initial(), but I
  don't think it's needed.

- Update vacuum_delay_point() call.

- Cast PID to long in elog call (same as we do in postmaster.c)

- Fix test 003_standby_restarts by adding a replication slot

- Fix DataChecksumsWorkerShmemInit to only zero the memory once.

- Update child_process_kinds to keep it in sync with BackendType.

- Fix expected output for stats regression test.
---
 src/backend/postmaster/datachecksumsworker.c | 28 ++++++++++----------
 src/backend/postmaster/launch_backend.c      |  3 +++
 src/backend/postmaster/postmaster.c          |  5 ++++
 src/backend/utils/activity/pgstat_backend.c  |  2 ++
 src/test/checksum/t/003_standby_restarts.pl  | 10 ++++++-
 src/test/regress/expected/stats.out          | 18 ++++++++++++-
 6 files changed, 50 insertions(+), 16 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index de7a077f9c2..6df92684a3b 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -450,7 +450,7 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
-		vacuum_delay_point();
+		vacuum_delay_point(false);
 	}
 
 	pfree(relns);
@@ -587,7 +587,7 @@ ProcessDatabase(DataChecksumsWorkerDatabase *db)
 					db->dbname)));
 
 	snprintf(activity, sizeof(activity) - 1,
-			 "Waiting for worker in database %s (pid %d)", db->dbname, pid);
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
 	status = WaitForBackgroundWorkerShutdown(bgw_handle);
@@ -752,9 +752,6 @@ DataChecksumsWorkerLauncherMain(Datum arg)
 	 */
 	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
 
-	/* Initialize backend status information */
-	pgstat_bestart();
-
 	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
 	DataChecksumsWorkerShmem->launcher_running = true;
 	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
@@ -1095,16 +1092,19 @@ DataChecksumsWorkerShmemInit(void)
 						DataChecksumsWorkerShmemSize(),
 						&found);
 
-	MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
 
-	/*
-	 * Even if this is a redundant assignment, we want to be explicit about
-	 * our intent for readability, since we want to be able to query this
-	 * state in case of restartability.
-	 */
-	DataChecksumsWorkerShmem->launch_enable_checksums = false;
-	DataChecksumsWorkerShmem->launcher_running = false;
-	DataChecksumsWorkerShmem->launch_fast = false;
+		/*
+		 * Even if this is a redundant assignment, we want to be explicit about
+		 * our intent for readability, since we want to be able to query this
+		 * state in case of restartability.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
 }
 
 /*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 47375e5bfaa..92d8017fd56 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -202,6 +202,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6f..2fc438987b5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2947,6 +2947,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 6efbb650aa8..bd458f8c1af 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -295,6 +295,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
index 6c0fe8f3bf8..6782664f4e6 100644
--- a/src/test/checksum/t/003_standby_restarts.pl
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -20,15 +20,23 @@ my $enable_params = '0, 100, true';
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(allows_streaming => 1, no_data_checksums => 1);
 $node_primary->start;
-my $backup_name = 'my_backup';
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
 
 # Take backup
+my $backup_name = 'my_backup';
 $node_primary->backup($backup_name);
 
 # Create streaming standby linking to primary
 my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
 $node_standby_1->init_from_backup($node_primary, $backup_name,
 	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
 $node_standby_1->start;
 
 # Create some content on the primary to have un-checksummed data in the cluster
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 30d763c4aee..da6645f0d7b 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -87,7 +103,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(71 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
-- 
2.48.1

Attachment: v20250310d-0003-sync-the-data_checksums-GUC-with-the-loca.patch (text/x-patch)
From 3ad310ac23abc138e926dc72c9b4a3d4c8e45c28 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 23:07:44 +0100
Subject: [PATCH v20250310d 3/5] sync the data_checksums GUC with the local
 variable

We now have three places that in some way express the state of data
checksums in the instance.

- control file / data_checksum_version

- LocalDataChecksumVersion as a local cache to reduce locking

- data_checksums backing the GUC

We need to keep the GUC variable in sync, to ensure we get the correct
value even early during startup (e.g. with -C).

Introduces a new SetLocalDataChecksumVersion() which sets the local
variable, and also updates the data_checksums GUC at the same time.
---
 src/backend/access/transam/xlog.c   | 51 +++++++++++++++++++++++++----
 src/backend/utils/misc/guc_tables.c |  3 +-
 src/include/access/xlog.h           |  1 +
 3 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b5190a7e104..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -657,6 +657,14 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -4594,7 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4913,7 +4921,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
 
@@ -4921,21 +4929,21 @@ bool
 AbsorbChecksumsOnBarrier(void)
 {
 	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
-	LocalDataChecksumVersion = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
 
 bool
 AbsorbChecksumsOffBarrier(void)
 {
-	LocalDataChecksumVersion = 0;
+	SetLocalDataChecksumVersion(0);
 	return true;
 }
 
@@ -4951,10 +4959,41 @@ void
 InitLocalControldata(void)
 {
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	LocalDataChecksumVersion = ControlFile->data_checksum_version;
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 	LWLockRelease(ControlFileLock);
 }
 
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
 /* guc hook */
 const char *
 show_data_checksums(void)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 05cba3a02e3..66abf056159 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -609,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static int	data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -5300,7 +5299,7 @@ struct config_enum ConfigureNamesEnum[] =
 		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether data checksums are turned on for this cluster."),
 			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
 		},
 		&data_checksums,
 		DATA_CHECKSUMS_OFF, data_checksums_options,
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 0bb32c9c0fc..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
-- 
2.48.1

Attachment: v20250310d-0004-make-progress-reporting-work.patch (text/x-patch)
From 6e5f04d2ae4e037f80fb0d3bedd73e5fe54d79b0 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sat, 8 Mar 2025 19:34:50 +0100
Subject: [PATCH v20250310d 4/5] make progress reporting work

- Splits the progress status by worker type - one row for launcher,
  one row for checksum worker (might be more with parallel workers).

- The launcher only updates database counters, the workers only set
  relation/block counters.

- Also reworks the columns in the system view a bit, discards the
  "current" fields (we still know the database for each worker).

- Issue: Not sure what to do about relation forks, at this point it
  tracks only blocks for MAIN fork.
---
 src/backend/catalog/system_views.sql         | 36 ++++----
 src/backend/postmaster/datachecksumsworker.c | 91 +++++++++++++++++---
 src/include/commands/progress.h              | 27 +++---
 src/test/regress/expected/rules.out          | 29 ++++---
 4 files changed, 131 insertions(+), 52 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 0d62976dc1f..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1335,25 +1335,25 @@ CREATE VIEW pg_stat_progress_copy AS
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
 CREATE VIEW pg_stat_progress_data_checksums AS
-	SELECT
-		S.pid AS pid, S.datid AS datid, D.datname AS datname,
-		CASE S.param1 WHEN 0 THEN 'enabling'
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
-					  WHEN 2 THEN 'waiting'
-					  WHEN 3 THEN 'waiting on backends'
-					  WHEN 4 THEN 'waiting on temporary tables'
-					  WHEN 5 THEN 'done'
-					  END AS phase,
-		CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
-		CASE S.param3 WHEN -1 THEN NULL ELSE S.param3 END AS relations_total,
-		S.param4 AS databases_processed,
-		S.param5 AS relations_processed,
-		S.param6 AS databases_current,
-		S.param7 AS relation_current,
-		S.param8 AS relation_current_blocks,
-		S.param9 AS relation_current_blocks_processed
-	FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
-		LEFT JOIN pg_database D ON S.datid = D.oid;
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
 
 CREATE VIEW pg_user_mappings AS
     SELECT
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index 6df92684a3b..bbbce61cfa6 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -409,6 +409,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
 	pgstat_report_activity(STATE_RUNNING, activity);
 
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
 	/*
 	 * We are looping over the blocks which existed at the time of process
 	 * start, which is safe since new blocks are created with checksums set
@@ -450,6 +455,11 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		if (abort_requested)
 			return false;
 
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
 		vacuum_delay_point(false);
 	}
 
@@ -798,6 +808,8 @@ again:
 		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
 
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -908,8 +920,29 @@ ProcessAllDatabases(bool immediate_checkpoint)
 	 * columns for processed databases is instead increased such that it can
 	 * be compared against the total.
 	 */
-	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_DB,
-								 list_length(DatabaseList));
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
 
 	while (true)
 	{
@@ -921,14 +954,6 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			DataChecksumsWorkerResultEntry *entry;
 			bool		found;
 
-			/*
-			 * Indicate which database is being processed set the number of
-			 * relations to -1 to clear field from previous values. -1 will
-			 * translate to NULL in the progress view.
-			 */
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_CUR_DB, db->dboid);
-			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_TOTAL_REL, -1);
-
 			/*
 			 * Check if this database has been processed already, and if so
 			 * whether it should be retried or skipped.
@@ -957,6 +982,12 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			result = ProcessDatabase(db);
 			processed_databases++;
 
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
 			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
 			{
 				/*
@@ -1050,6 +1081,13 @@ ProcessAllDatabases(bool immediate_checkpoint)
 				 errhint("The server log might have more information on the cause of the error.")));
 	}
 
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint so that all
+	 * checksummed pages are safely on disk.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
 	/*
 	 * Force a checkpoint to get everything out to disk. The use of immediate
 	 * checkpoints is for running tests, as they would otherwise not execute
@@ -1258,6 +1296,7 @@ DataChecksumsWorkerMain(Datum arg)
 	List	   *InitialTempTableList = NIL;
 	BufferAccessStrategy strategy;
 	bool		aborted = false;
+	int64		rels_done;
 
 	enabling_checksums = true;
 
@@ -1271,6 +1310,10 @@ DataChecksumsWorkerMain(Datum arg)
 	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
 											  BGWORKER_BYPASS_ALLOWCONN);
 
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
 	/*
 	 * Get a list of all temp tables present as we start in this database. We
 	 * need to wait until they are all gone until we are done, since we cannot
@@ -1297,6 +1340,24 @@ DataChecksumsWorkerMain(Datum arg)
 
 	RelationList = BuildRelationList(false,
 									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
 	foreach_oid(reloid, RelationList)
 	{
 		if (!ProcessSingleRelationByOid(reloid, strategy))
@@ -1304,6 +1365,9 @@ DataChecksumsWorkerMain(Datum arg)
 			aborted = true;
 			break;
 		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
 	}
 	list_free(RelationList);
 
@@ -1316,6 +1380,10 @@ DataChecksumsWorkerMain(Datum arg)
 		return;
 	}
 
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
 	/*
 	 * Wait for all temp tables that existed when we started to go away. This
 	 * is necessary since we cannot "reach" them to enable checksums. Any temp
@@ -1372,5 +1440,8 @@ DataChecksumsWorkerMain(Datum arg)
 
 	list_free(InitialTempTableList);
 
+	/* worker done */
+	pgstat_progress_end_command();
+
 	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
 }
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1ebd0c792b4..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -158,21 +158,20 @@
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
 /* Progress parameters for PROGRESS_DATACHECKSUMS */
-#define PROGRESS_DATACHECKSUMS_PHASE 0
-#define PROGRESS_DATACHECKSUMS_TOTAL_DB 1
-#define PROGRESS_DATACHECKSUMS_TOTAL_REL 2
-#define PROGRESS_DATACHECKSUMS_PROCESSED_DB 3
-#define PROGRESS_DATACHECKSUMS_PROCESSED_REL 4
-#define PROGRESS_DATACHECKSUMS_CUR_DB 5
-#define PROGRESS_DATACHECKSUMS_CUR_REL 6
-#define PROGRESS_DATACHECKSUMS_CUR_REL_TOTAL_BLOCKS 7
-#define PROGRESS_DATACHECKSUMS_CUR_REL_PROCESSED_BLOCKS 8
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
 
 /* Phases of datachecksumsworker operation */
-#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING 0
-#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING 1
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS 2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL 3
-#define PROGRESS_DATACHECKSUMS_DONE 4
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
 
 #endif
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 36bed4b168d..2cfea837554 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2050,25 +2050,34 @@ pg_stat_progress_data_checksums| SELECT s.pid,
             WHEN 2 THEN 'waiting'::text
             WHEN 3 THEN 'waiting on backends'::text
             WHEN 4 THEN 'waiting on temporary tables'::text
-            WHEN 5 THEN 'done'::text
+            WHEN 5 THEN 'waiting on checkpoint'::text
+            WHEN 6 THEN 'done'::text
             ELSE NULL::text
         END AS phase,
         CASE s.param2
             WHEN '-1'::integer THEN NULL::bigint
             ELSE s.param2
         END AS databases_total,
-        CASE s.param3
+    s.param3 AS databases_done,
+        CASE s.param4
             WHEN '-1'::integer THEN NULL::bigint
-            ELSE s.param3
+            ELSE s.param4
         END AS relations_total,
-    s.param4 AS databases_processed,
-    s.param5 AS relations_processed,
-    s.param6 AS databases_current,
-    s.param7 AS relation_current,
-    s.param8 AS relation_current_blocks,
-    s.param9 AS relation_current_blocks_processed
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
    FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
-     LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
-- 
2.48.1
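
As a quick illustration (not part of the patch), the renamed progress columns
above could be watched from psql with something along these lines (an untested
sketch against this version of the patch):

    -- one row for the launcher plus one per active per-database worker
    SELECT pid, datname, phase,
           databases_done, databases_total,
           relations_done, relations_total,
           blocks_done, blocks_total
    FROM pg_stat_progress_data_checksums;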

v20250310d-0005-update-docs.patch (text/x-patch)
From 1f053c9b7d1e12030b10240a54008b6f1e898b93 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sun, 9 Mar 2025 13:34:10 +0100
Subject: [PATCH v20250310d 5/5] update docs

---
 doc/src/sgml/monitoring.sgml | 84 ++++++++++++++++++++----------------
 doc/src/sgml/wal.sgml        |  8 ++--
 2 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4a6ef5f1605..2a568ee2357 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6804,8 +6804,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
   <para>
    When data checksums are being enabled on a running cluster, the
    <structname>pg_stat_progress_data_checksums</structname> view will contain
-   a row for each background worker which is currently calculating checksums
-   for the data pages.
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
   </para>
 
   <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
@@ -6837,37 +6837,33 @@ FROM pg_stat_get_backend_idset() AS backendid;
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>phase</structfield> <type>text</type>
-       </para>
-       <para>
-        Current processing phase, see <xref linkend="datachecksum-phases"/>
-        for description of the phases.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the
+       launcher process
+      </para></entry>
      </row>
 
      <row>
-      <entry role="catalog_table_entry">
-       <para role="column_definition">
-        <structfield>databases_total</structfield> <type>integer</type>
-       </para>
-       <para>
-        The total number of databases which will be processed.
-       </para>
-      </entry>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
      </row>
 
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_total</structfield> <type>integer</type>
+        <structfield>phase</structfield> <type>text</type>
        </para>
        <para>
-        The total number of relations which will be processed, or
-        <literal>NULL</literal> if the data checksums worker process hasn't
-        calculated the number of relations yet.
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
        </para>
       </entry>
      </row>
@@ -6875,10 +6871,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_processed</structfield> <type>integer</type>
+        <structfield>databases_total</structfield> <type>integer</type>
        </para>
        <para>
-        The number of databases which have been processed.
+        The total number of databases which will be processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6886,10 +6884,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relations_processed</structfield> <type>integer</type>
+        <structfield>databases_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of relations which have been processed.
+        The number of databases which have been processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6897,10 +6897,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>databases_current</structfield> <type>oid</type>
+        <structfield>relations_total</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the database currently being processed.
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6908,10 +6911,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current</structfield> <type>oid</type>
+        <structfield>relations_done</structfield> <type>integer</type>
        </para>
        <para>
-        OID of the relation currently being processed.
+        The number of relations which have been processed. The launcher
+        process has this <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6919,10 +6923,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks</structfield> <type>integer</type>
+        <structfield>blocks_total</structfield> <type>integer</type>
        </para>
        <para>
-        The total number of blocks in the relation currently being processed.
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6930,11 +6937,11 @@ FROM pg_stat_get_backend_idset() AS backendid;
      <row>
       <entry role="catalog_table_entry">
        <para role="column_definition">
-        <structfield>relation_current_blocks_processed</structfield> <type>integer</type>
+        <structfield>blocks_done</structfield> <type>integer</type>
        </para>
        <para>
-        The number of blocks which have been processed in the relation currently
-        being processed.
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6982,9 +6989,10 @@ FROM pg_stat_get_backend_idset() AS backendid;
       </entry>
      </row>
      <row>
-      <entry><literal>done</literal></entry>
+      <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
-       The command has finished processing and is exiting.
+       The command is currently waiting for a checkpoint to update the checksum
+       state at the end.
       </entry>
      </row>
     </tbody>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 2562b1a002c..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -310,15 +310,17 @@
     If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
     any reason, then this process must be restarted manually. To do this,
     re-execute the function <function>pg_enable_data_checksums()</function>
-    once the cluster has been restarted. The background worker will attempt
-    to resume the work from where it was interrupted.
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
    </para>
 
    <note>
     <para>
      Enabling checksums can cause significant I/O to the system, as most of the
      database pages will need to be rewritten, and will be written both to the
-     data files and the WAL.
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
     </para>
    </note>
 
-- 
2.48.1
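
For reference, a minimal usage sketch based on the documentation above; the
throttling values are arbitrary and the named-parameter form assumes the
parameter names documented in func.sgml:

    -- start online enabling, throttled like cost-based vacuum
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);

    -- and to turn checksums back off again later
    SELECT pg_disable_data_checksums();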

#25Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#24)
Re: Changing the state of data checksums in a running cluster

On 10 Mar 2025, at 12:17, Tomas Vondra <tomas@vondra.me> wrote:

On 3/10/25 10:46, Tomas Vondra wrote:

On 3/10/25 01:18, Tomas Vondra wrote:

Thank you so much for picking up and fixing the blockers, it's highly appreciated!

For me, this passes all CI tests, hopefully cfbot will be happy too.

Confirmed, it compiles clean, builds docs and passes all tests for me as well.

A few comments from reading over your changes:

+   launcher worker has this value set, the other worker processes
+   have this <literal>NULL</literal>.
There seems to be a word or two missing (same in a few places), should this be
"have this set to NULL"?
+   The command is currently waiting for a checkpoint to update the checksum
+   state at the end.
s/at the end/before finishing/?

+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
They aren't mapping 1:1 as PG_DATA_ has the version numbers, and if checksums
aren't enabled there is no version and thus there is no PG_DATA_CHECKSUMS_OFF.
This could of course be remedied. IIRC one reason for adding the enum was to
get compiler warnings on missing cases when switch()ing over the value, but I
don't think the current code has any switch.

+ /* XXX isn't it weird there's no wait between the phase updates? */
It is, I think we should skip PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS in
favor of PROGRESS_DATACHECKSUMS_PHASE_ENABLING.

+   * When enabling checksums, we have to wait for a checkpoint for the
+   * checksums to e.
Seems to be missing the punchline, "for the checksum state to be moved from
in-progress to on" perhaps?

It also needs a pgindent and pgperltidy but there were only small trivial
changes there.

Thanks again for updating the patch!

--
Daniel Gustafsson
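
For context on the states discussed above, the GUC as implemented in the patch
is expected to report one of four values (illustrative, not verified against
this exact version):

    SHOW data_checksums;
    -- off | inprogress-on | on | inprogress-off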

#26Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#25)
4 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 3/10/25 14:27, Daniel Gustafsson wrote:

On 10 Mar 2025, at 12:17, Tomas Vondra <tomas@vondra.me> wrote:

On 3/10/25 10:46, Tomas Vondra wrote:

On 3/10/25 01:18, Tomas Vondra wrote:

Thank you so much for picking up and fixing the blockers, it's highly appreciated!

For me, this passes all CI tests, hopefully cfbot will be happy too.

Confirmed, it compiles clean, builds docs and passes all tests for me as well.

A few comments from reading over your changes:

+   launcher worker has this value set, the other worker processes
+   have this <literal>NULL</literal>.
There seems to be a word or two missing (same in a few places), should this be
"have this set to NULL"?

done

+   The command is currently waiting for a checkpoint to update the checksum
+   state at the end.
s/at the end/before finishing/?

done

+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
They aren't mapping 1:1 as PG_DATA_ has the version numbers, and if checksums
aren't enabled there is no version and thus there is no PG_DATA_CHECKSUMS_OFF.
This could of course be remedied. IIRC one reason for adding the enum was to
get compiler warnings on missing cases when switch()ing over the value, but I
don't think the current code has any switch.

I haven't done anything about this. I'm not convinced it's an issue we
need to fix, and I haven't checked how much work it would be.

+ /* XXX isn't it weird there's no wait between the phase updates? */
It is, I think we should skip PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS in
favor of PROGRESS_DATACHECKSUMS_PHASE_ENABLING.

Removed the WAITING_BACKENDS phase.

+   * When enabling checksums, we have to wait for a checkpoint for the
+   * checksums to e.
Seems to be missing the punchline, "for the checksum state to be moved from
in-progress to on" perhaps?

done

It also needs a pgindent and pgperltidy but there were only small trivial
changes there.

done

Attached is an updated version.

--
Tomas Vondra

Attachments:

v20250310e-0001-Online-enabling-and-disabling-of-data-che.patch (text/x-patch)
From 1cb65478673dc8ffa01312f7a06fc488164c8f20 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250310e 1/4] Online enabling and disabling of data
 checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could previously only be enabled during initdb or
when the cluster is offline using the pg_checksums application. This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  215 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  489 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1447 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |    3 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   31 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   17 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 +
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  139 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   37 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 58 files changed, 3194 insertions(+), 50 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..a09843e4ecf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29865,6 +29865,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts one <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 16646f560e8..2a568ee2357 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6793,6 +6794,212 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker or launcher process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the
+       launcher process
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will process all relations again, regardless of any prior online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,24 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +733,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +848,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +864,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4615,13 +4636,376 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedVerify(void)
 {
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	SetLocalDataChecksumVersion(0);
+	return true;
+}
+
+/*
+ * InitLocalControlData
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
+	LWLockRelease(ControlFileLock);
+}
+
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
+/* GUC show hook for the data_checksums parameter */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6194,6 +6578,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point while checksums are in the process of being
+	 * enabled ("inprogress-on" state), we notify the user that they need to
+	 * manually restart the processing to enable checksums. This is because
+	 * we cannot launch a dynamic background worker directly from here; it
+	 * has to be launched from a regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8612,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9058,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..bbbce61cfa6
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1447 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or on a shut-down cluster
+ * with pg_checksums, no extra process is required as each page is
+ * checksummed, and verified, when accessed.  When enabling checksums on an
+ * already running cluster, this worker will ensure that all pages are
+ * checksummed before verification of the checksums is turned on. In the case
+ * of disabling checksums, the state transition is performed only in the
+ * control file; no changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This
+ * ensures that backends which have yet to move from the "on" state can still
+ * validate data checksums correctly.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
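+ * As an illustrative sketch of that pattern (not code from this file; the
+ * page-write path shown is an assumption), a backend writing out a page
+ * would do something like:
+ *
+ *     HOLD_INTERRUPTS();
+ *     if (DataChecksumsNeedWrite())
+ *         PageSetChecksumInplace(page, blocknum);
+ *     ... perform the actual write of the page ...
+ *     RESUME_INTERRUPTS();
+ *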
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
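+ *   Putting the two procedures together, the transitions of the
+ *   data_checksums state are:
+ *
+ *     enabling:   off -> inprogress-on -> on
+ *     disabling:  on -> inprogress-off -> off
+ *                 inprogress-on -> off (no intermediate state is needed)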
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page happens to already match we currently dirty the
+ *     page. It should be enough to only do the log_newpage_buffer() call in
+ *     that case (see the sketch after ProcessSingleRelationFork below).
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip already checksummed pages when it is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Register and launch the datachecksumsworker launcher process
+ *
+ * The main entry point for starting data checksum processing, for enabling
+ * as well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation in pg_stat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can only happen if there was a valid checksum on
+		 * the page at some point in the past, i.e. when checksums were first
+		 * on, then off, and then turned on again. Only if wal_level is set
+		 * to "minimal" could this be avoided, and then only when the
+		 * checksum on the page is already correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort;
+		 * the abort will bubble up from here. It's safe to check this
+		 * without a lock, because if we miss it being set, we will check
+		 * again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
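+
+/*
+ * An illustrative sketch of the "avoid dirtying the page when the checksum
+ * already matches" optimization listed in the file header.  This is not
+ * implemented in this patch; the comparison below is an assumption about how
+ * it could be done, reusing pg_checksum_page():
+ *
+ *     Page    page = BufferGetPage(buf);
+ *
+ *     if (((PageHeader) page)->pd_checksum ==
+ *         pg_checksum_page((char *) page, blknum))
+ *         log_newpage_buffer(buf, false);
+ *     else
+ *     {
+ *         MarkBufferDirty(buf);
+ *         log_newpage_buffer(buf, false);
+ *     }
+ */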
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will be set to off before this,
+	 * so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksum processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end, so we still
+	 * need to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, the process
+ * exits with FATAL since processing cannot be completed.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, first check whether data checksums
+	 * are already fully enabled, in which case there is nothing left to do.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums, looping around computing a new list and comparing it to the
+ * databases already seen until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first run processes the shared catalogs, but they
+	 * are not reprocessed in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not be changed during processing; the
+	 * column for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%i databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "process with restart" : "process completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failures for databases which still
+		 * exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint so that all
+	 * checksummed pages are safely on disk before checksums can be
+	 * considered enabled.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is intended for running tests, which could otherwise not
+	 * be reliably placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if these assignments are redundant after the MemSet, we want
+		 * to be explicit about our intent for readability, since this state
+		 * may be queried when processing is restarted.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list. include_shared has no relevance when building a list
+ * of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present as we start in this database. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
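+		/*
+		 * Check whether the requested operation has changed in shared memory
+		 * (e.g. a disable request arrived while we were enabling); if so,
+		 * abort this worker.
+		 */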
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 47375e5bfaa..92d8017fd56 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -202,6 +202,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6f..2fc438987b5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2947,6 +2947,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 24d88f368d8..00cce4c7673 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
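+While a transition is in progress, the data_checksums GUC reports the
+intermediate state "inprogress-on" or "inprogress-off".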
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 6efbb650aa8..bd458f8c1af 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -295,6 +295,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 3c594415bfd..b1b5cdcf36c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for the enabling of data checksums to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 9172e1cb9d2..f6bd1e9f0ca 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ee1a9d5d98b..70899e6ef2d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -739,6 +739,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -869,7 +874,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ad25cbb39c5..66abf056159 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1959,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5299,6 +5295,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +231,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index cede992b6e2..fb0e062b984 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12246,6 +12246,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,21 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index a2b63495eec..9923c7f518d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+extern void StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+extern void DataChecksumsWorkerLauncherMain(Datum arg);
+extern void DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
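+/*
+ * The state of data checksums in the cluster.  The values match the
+ * data_checksum_version values stored in the control file (see the
+ * PG_DATA_CHECKSUM_*_VERSION defines in bufpage.h), so the two can be
+ * compared directly.
+ */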
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation with
+multiple nodes (primary and standbys) for the purpose of the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
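+
+As a point of reference, the behaviour exercised by these tests can also be
+driven manually with the SQL-level functions, for example using the same
+cost_delay, cost_limit and fast parameters as the tests:
+
+    SELECT pg_enable_data_checksums(0, 100, true);
+    SHOW data_checksums;
+    SELECT pg_disable_data_checksums();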
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast.  For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints.  0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so we
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again, which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast.  For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints.  0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so we
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We
+# accomplish this with a background psql session which keeps the temporary
+# table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and verify that enabling does not complete while the temp
+# table exists.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
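+# After the restart the cluster should still be in the inprogress-on state;
+# checksum processing does not resume automatically, so it has to be
+# requested again below.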
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6782664f4e6
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,139 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast.  For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints.  0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so we
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+ok($result eq 'inprogress-on' || $result eq 'on',
+	'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums() takes three parameters: cost_delay, cost_limit
+# and fast.  For testing we always want to override the default value for
+# 'fast' with true, which causes immediate checkpoints.  0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so we
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We
+# accomplish this with a background psql session which keeps the temporary
+# table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and verify that enabling does not complete while the temp
+# table exists.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	my $pgdata = $self->data_dir;
+
+	print "### Enabling checksums in \"$pgdata\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $pgdata, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	my $pgdata = $self->data_dir;
+
+	print "### Disabling checksums in \"$pgdata\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $pgdata, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..2cfea837554 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,43 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'waiting on checkpoint'::text
+            WHEN 6 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 30d763c4aee..da6645f0d7b 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -87,7 +103,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(71 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9840060997f..86e4057f61d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -593,6 +594,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4141,6 +4146,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1

Attachment: v20250310e-0002-review-fixes.patch (text/x-patch)
From 98b6bf1adb3a5a009d130b93e1089c284a17d132 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 14:56:14 +0100
Subject: [PATCH v20250310e 2/4] review fixes

---
 doc/src/sgml/monitoring.sgml                 | 13 +++----------
 src/backend/catalog/system_views.sql         |  7 +++----
 src/backend/postmaster/datachecksumsworker.c | 11 +----------
 src/include/commands/progress.h              |  5 ++---
 src/include/miscadmin.h                      |  4 +++-
 src/test/regress/expected/rules.out          |  7 +++----
 6 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 2a568ee2357..4602134cbde 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6876,7 +6876,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
        <para>
         The total number of databases which will be processed. Only the
         launcher worker has this value set, the other worker processes
-        have this <literal>NULL</literal>.
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6889,7 +6889,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
        <para>
         The number of databases which have been processed. Only the
         launcher worker has this value set, the other worker processes
-        have this <literal>NULL</literal>.
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6974,13 +6974,6 @@ FROM pg_stat_get_backend_idset() AS backendid;
        The command is currently disabling data checksums on the cluster.
       </entry>
      </row>
-     <row>
-      <entry><literal>waiting on backends</literal></entry>
-      <entry>
-       The command is currently waiting for backends to acknowledge the data
-       checksum operation.
-      </entry>
-     </row>
      <row>
       <entry><literal>waiting on temporary tables</literal></entry>
       <entry>
@@ -6992,7 +6985,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
       <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
        The command is currently waiting for a checkpoint to update the checksum
-       state at the end.
+       state before finishing.
       </entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 4330d0ad656..6ffd31ce39c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1340,10 +1340,9 @@ CREATE VIEW pg_stat_progress_data_checksums AS
         CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
                       WHEN 2 THEN 'waiting'
-                      WHEN 3 THEN 'waiting on backends'
-                      WHEN 4 THEN 'waiting on temporary tables'
-                      WHEN 5 THEN 'waiting on checkpoint'
-                      WHEN 6 THEN 'done'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
                       END AS phase,
         CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
         S.param3 AS databases_done,
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index bbbce61cfa6..d9833cd4c98 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -801,15 +801,6 @@ again:
 		}
 		RESUME_INTERRUPTS();
 
-		/*
-		 * Initialize progress and indicate that we are waiting on the other
-		 * backends to clear the procsignalbarrier.
-		 */
-		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
-									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
-
-		/* XXX isn't it weird there's no wait between the phase updates? */
-
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -1083,7 +1074,7 @@ ProcessAllDatabases(bool immediate_checkpoint)
 
 	/*
 	 * When enabling checksums, we have to wait for a checkpoint for the
-	 * checksums to e.
+	 * checksums to change from in-progress to on.
 	 */
 	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 94b478a6cc9..b172a5f24ce 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -170,8 +170,7 @@
 #define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
 #define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
 #define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
 
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9923c7f518d..69b1dc720f6 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -392,7 +392,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
-#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2cfea837554..0dd383a76dd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2048,10 +2048,9 @@ pg_stat_progress_data_checksums| SELECT s.pid,
             WHEN 0 THEN 'enabling'::text
             WHEN 1 THEN 'disabling'::text
             WHEN 2 THEN 'waiting'::text
-            WHEN 3 THEN 'waiting on backends'::text
-            WHEN 4 THEN 'waiting on temporary tables'::text
-            WHEN 5 THEN 'waiting on checkpoint'::text
-            WHEN 6 THEN 'done'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
             ELSE NULL::text
         END AS phase,
         CASE s.param2
-- 
2.48.1

v20250310e-0003-pgindent.patch (text/x-patch)
From 239be70853039067f1408c0e46f18a6888403275 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 14:57:05 +0100
Subject: [PATCH v20250310e 3/4] pgindent

---
 src/backend/postmaster/datachecksumsworker.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index d9833cd4c98..6a201dca8de 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -921,7 +921,7 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
 		};
 
-		int64	vals[6];
+		int64		vals[6];
 
 		vals[0] = list_length(DatabaseList);
 		vals[1] = 0;
@@ -974,7 +974,8 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			processed_databases++;
 
 			/*
-			 * Update the number of processed databases in the progress report.
+			 * Update the number of processed databases in the progress
+			 * report.
 			 */
 			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
 										 processed_databases);
@@ -1126,9 +1127,9 @@ DataChecksumsWorkerShmemInit(void)
 		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
 
 		/*
-		 * Even if this is a redundant assignment, we want to be explicit about
-		 * our intent for readability, since we want to be able to query this
-		 * state in case of restartability.
+		 * Even if this is a redundant assignment, we want to be explicit
+		 * about our intent for readability, since we want to be able to query
+		 * this state in case of restartability.
 		 */
 		DataChecksumsWorkerShmem->launch_enable_checksums = false;
 		DataChecksumsWorkerShmem->launcher_running = false;
@@ -1339,7 +1340,7 @@ DataChecksumsWorkerMain(Datum arg)
 			PROGRESS_DATACHECKSUMS_RELS_DONE
 		};
 
-		int64	vals[2];
+		int64		vals[2];
 
 		vals[0] = list_length(RelationList);
 		vals[1] = 0;
-- 
2.48.1

v20250310e-0004-perltidy.patch (text/x-patch)
From 219dc80d840db30365c78ea1f5bcd3ffd198a340 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 15:05:43 +0100
Subject: [PATCH v20250310e 4/4] perltidy

---
 src/test/perl/PostgreSQL/Test/Cluster.pm | 6 ++++--
 src/test/subscription/t/013_partition.pl | 3 +--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 666bd2a2d4c..1c66360c16c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3761,7 +3761,8 @@ sub checksum_enable_offline
 	my ($self) = @_;
 
 	print "### Enabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
 	return;
 }
 
@@ -3778,7 +3779,8 @@ sub checksum_disable_offline
 	my ($self) = @_;
 
 	print "### Disabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
 	return;
 }
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 61b0cb4aa1a..4f78dd48815 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -51,8 +51,7 @@ $node_subscriber1->safe_psql('postgres',
 );
 # make a BRIN index to test aminsertcleanup logic in subscriber
 $node_subscriber1->safe_psql('postgres',
-	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)"
-);
+	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)"
 );
-- 
2.48.1

#27Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#26)
Re: Changing the state of data checksums in a running cluster

One thing I forgot to mention is that the progress reporting only updates
blocks for the FORK_MAIN. It wouldn't be difficult to report blocks for
each fork, but it'd be confusing - the relation counters would remain
the same, but the block counters would change for each fork.

I guess we could report the current_relation/fork, but it seems like
overkill. The main fork is by far the largest one, so this seems OK.

regards

--
Tomas Vondra

#28Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#27)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

Hi,

I continued stress testing this, as I was rather unsure why the assert
failures reported in [1] disappeared. And I managed to reproduce that
again, and I think I actually understand why it happens.

I modified the test script (attached) to set up replication, not just a
single instance. And then it does a bit of work, flips the checksums,
restarts the instances (randomly, fast/immediate), verifies the checksums
and so on. And I can hit this assert in AbsorbChecksumsOnBarrier()
pretty easily:

Assert(LocalDataChecksumVersion ==
PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);

The reason is pretty simple - this happens on the standby:

1) standby receives XLOG_CHECKSUMS and applies it from 2 to 1 (i.e. it
sets ControlFile->data_checksum_version from "inprogress-on" to "on"),
and signals all other processes to refresh LocalDataChecksumVersion

2) the control file gets written to disk for whatever reason (redo does
this in a number of places)

3) standby gets restarted with "immediate" mode (I'm not sure if this
can happen with "fast" mode, I only recall seeing "immediate")

4) the standby receives the XLOG_CHECKSUMS record *again*, updates the
ControlFile->data_checksum_version (to the same value, so effectively a
no-op), and then signals the other processes again

5) the other processes already have LocalDataChecksumVersion=1 (on), but
the assert says it should be 2 (inprogress-on) => kaboom

I believe this can happen for changes in either direction, although the
window while disabling checksums is more narrow.

I'm not sure what to do about this. Maybe we could relax the assert in
some way? But that seems a bit ... possibly risky. It's not necessarily
true we'll see the immediately preceding checksum state; we might see a
couple of updates back (if the control file was not updated in between).
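To make the "relax the assert" option a bit more concrete, a minimal sketch
(based on AbsorbChecksumsOnBarrier() as it appears in the posted patch, not
a tested change) could simply accept that the local state is already "on"
when the barrier is seen again after a restart:

bool
AbsorbChecksumsOnBarrier(void)
{
	/*
	 * After a standby restart the XLOG_CHECKSUMS record - and thus this
	 * barrier - may be replayed again, so the local state can already be
	 * "on" rather than "inprogress-on".
	 */
	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
		   LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
	return true;
}

That only covers being exactly one state ahead, though; if the control file
can be several updates ahead, this kind of relaxation quickly stops being
meaningful.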

Could this affect checksum verification during recovery? Imagine we get
to the "on" state, the controlfile gets flushed, and then the standby
restarts and starts receiving older records again. The control file says
we should be verifying checksums, but couldn't some of the writes have
been lost (and so the pages may not have a valid checksum)?

The one idea I have is to create an "immediate" restartpoint in
xlog_redo() right after XLOG_CHECKSUMS updates the control file. AFAICS
a "spread" restartpoint would not be enough, because then we could get
into the same situation with a control file out of sync (ahead of WAL) after
a restart. It'd not be cheap, but it should be a rare operation ...

I was wondering if the primary has the same issue, but AFAICS it does
not. It flushes the control file in only a couple places, I couldn't
think of a way to get it out of sync.

regards

[1]: /messages/by-id/e4dbcb2c-e04a-4ba2-bff0-8d979f55960e@vondra.me

--
Tomas Vondra

Attachments:

test.sh (application/x-shellscript)

#29Dagfinn Ilmari Mannsåker
In reply to: Tomas Vondra (#26)
Re: Changing the state of data checksums in a running cluster

As the resident perl style pedant, I'd just like to complain about the
below:

Tomas Vondra <tomas@vondra.me> writes:

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 666bd2a2d4c..1c66360c16c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3761,7 +3761,8 @@ sub checksum_enable_offline
my ($self) = @_;
print "### Enabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
return;
}

This breaking between the command line options and its arguments is why
we're switching to using fat commas. We're also using long options for
improved self-documentation, so this should be written as:

PostgreSQL::Test::Utils::system_or_bail('pg_checksums',
'--pgdata' => $self->data_dir,
'--enable');

And likewise below in the disable method.

- ilmari

#30Tomas Vondra
tomas@vondra.me
In reply to: Dagfinn Ilmari Mannsåker (#29)
Re: Changing the state of data checksums in a running cluster

On 3/11/25 14:07, Dagfinn Ilmari Mannsåker wrote:

As the resident perl style pedant, I'd just like to complain about the
below:

Tomas Vondra <tomas@vondra.me> writes:

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 666bd2a2d4c..1c66360c16c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3761,7 +3761,8 @@ sub checksum_enable_offline
my ($self) = @_;
print "### Enabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
return;
}

This breaking between the command line options and its arguments is why
we're switching to using fat commas. We're also using long options for
improved self-documentation, so this should be written as:

PostgreSQL::Test::Utils::system_or_bail('pg_checksums',
'--pgdata' => $self->data_dir,
'--enable');

And likewise below in the disable method.

I don't know what a fat comma is, but that's simply what perltidy did. I
don't mind formatting it differently, if there's a better way.

thanks

--
Tomas Vondra

#31Dagfinn Ilmari Mannsåker
In reply to: Tomas Vondra (#30)
Re: Changing the state of data checksums in a running cluster

Tomas Vondra <tomas@vondra.me> writes:

On 3/11/25 14:07, Dagfinn Ilmari Mannsåker wrote:

As the resident perl style pedant, I'd just like to complain about the
below:

Tomas Vondra <tomas@vondra.me> writes:

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 666bd2a2d4c..1c66360c16c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3761,7 +3761,8 @@ sub checksum_enable_offline
my ($self) = @_;
print "### Enabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
return;
}

This breaking between the command line options and its arguments is why
we're switching to using fat commas. We're also using long options for
improved self-documentation, so this should be written as:

PostgreSQL::Test::Utils::system_or_bail('pg_checksums',
'--pgdata' => $self->data_dir,
'--enable');

And likewise below in the disable method.

I don't know what fat comma is, but that's simply what perltidy did. I
don't mind formatting it differently, if there's a better way.

Fat comma is the perlish name for the => arrow, which is semantically
equivalent to a comma (except it auto-quotes any immediately preceding
bareword), but looks fatter. Perltidy knows to not wrap lines around
them, keeping the key and value (or option and argument in this case)
together. See commit ce1b0f9da03 for a large (but not complete, I have
more patches pending) conversion to this new style.

- ilmari

#32Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#28)
7 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 3/10/25 18:35, Tomas Vondra wrote:

Hi,

I continued stress testing this, as I was rather unsure why the assert
failures reported in [1] disappeared. And I managed to reproduce that
again, and I think I actually understand why it happens.

I modified the test script (attached) to set up replication, not just a
single instance. And then it does a bit of work, flips the checksums,
restarts the instances (randomly, fast/immediate), verifies the checksums
and so on. And I can hit this assert in AbsorbChecksumsOnBarrier()
pretty easily:

Assert(LocalDataChecksumVersion ==
PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);

The reason is pretty simple - this happens on the standby:

1) standby receives XLOG_CHECKSUMS and applies it from 2 to 1 (i.e. it
sets ControlFile->data_checksum_version from "inprogress-on" to "on"),
and signals all other processes to refresh LocalDataChecksumVersion

2) the control file gets written to disk for whatever reason (redo does
this in a number of places)

3) standby gets restarted with "immediate" mode (I'm not sure if this
can happen with "fast" mode, I only recall seeing "immediate")

4) the standby receives the XLOG_CHECKSUMS record *again*, updates the
ControlFile->data_checksum_version (to the same value, so effectively a
no-op), and then signals the other processes again

5) the other processes already have LocalDataChecksumVersion=1 (on), but
the assert says it should be 2 (inprogress-on) => kaboom

I believe this can happen for changes in either direction, although the
window while disabling checksums is more narrow.

I'm not sure what to do about this. Maybe we could relax the assert in
some way? But that seems a bit ... possibly risky. It's not necessarily
true we'll see the immediately preceding checksum state; we might see a
couple of updates back (if the control file was not updated in between).

Could this affect checksum verification during recovery? Imagine we get
to the "on" state, the controlfile gets flushed, and then the standby
restarts and starts receiving older records again. The control file says
we should be verifying checksums, but couldn't some of the writes have
been lost (and so the pages may not have a valid checksum)?

The one idea I have is to create an "immediate" restartpoint in
xlog_redo() right after XLOG_CHECKSUMS updates the control file. AFAICS
a "spread" restartpoint would not be enough, because then we could get
into the same situation with a control file out of sync (ahead of WAL) after
a restart. It'd not be cheap, but it should be a rare operation ...

I was wondering if the primary has the same issue, but AFAICS it does
not. It flushes the control file in only a couple places, I couldn't
think of a way to get it out of sync.

I continued investigating this and experimenting with alternative
approaches, and I think the way the patch relies on ControlFile is not
quite right. That is, it always sets data_checksum_version to the last
("current") value, but that's not what ControlFile is for ...

The ControlFile is meant to hold a safe/consistent state, e.g. for crash
recovery. Setting data_checksum_version to the "last" value we've seen
breaks that - if the control file gets persisted (I haven't seen this on
the primary, but it's pretty common on a replica, per the report), the
recovery will start with a "future" data_checksum_version value. Which
is wrong - we'll read the XLOG_CHECKSUMS record, triggering the assert. I
suspect it might also lead to confusion about whether checksums should be
verified or not.

In my earlier message I suggested maybe this could be solved by forcing
a checkpoint every time we see the XLOG_CHECKSUMS record (or rather a
restart point, as it'd be on the replica). Sure, that would have some
undesirable consequences (forcing an immediate checkpoint is not cheap,
and the redo would need to wait for that). But the assumption was it'd
be very rare (how often you enable checksums?), so this cost might be
acceptable.

But when I started experimenting with this, I realized it has a couple
other issues:

1) We can't do the checkpoint/restartpoint when handling XLOG_CHECKSUMS,
because that'd mean we see this XLOG record again, which we don't want.
So the checkpoint would need to happen the *next* time we update the
control file.

2) But we can't trigger a checkpoint from UpdateControlFile, because of
locking (because CreateCheckPoint also calls UpdateControlFile). So this
would require much more invasive changes to all places updating the
control file.

3) It does not resolve the mismatch with using ControlFile to store
"current" data_checksums_version value.

4) ... probably more minor issues that I already forgot about.

In the end, I decided to try to rework this by storing the current value
elsewhere, and only updating the "persistent" value in the control file
when necessary.

XLogCtl seemed like a good place, so I used that - after all, it's a
value from XLOG. Maybe there's a better place? I'm open to suggestions,
but it does not really affect the overall approach.

So all the places now update XLogCtl->data_checksums_version instead of
the ControlFile, and also query this flag for the *current* value.

The value is copied from XLogCtl to ControlFile when creating checkpoint
(or restartpoint), and the control file is persisted. This means (a) the
current value can't get written to the control file prematurely, and (b)
the value is consistent with the checkpoint (i.e. with the LSN where we
start crash recovery, if needed).

The attached 0005 patch implements this. It's a bit WIP and I'm sure it
can be improved, but I've yet to see a single crash/failure with it. With
the original patch I've seen crashes after 5-10 loops (i.e. a couple of
minutes); I'm now at loop 1000 and it's still OK.

I believe the approach is correct, but the number of possible states
(e.g. after a crash/restart) seems a bit complex. I wonder if there's a
better way to handle this, but I can't think of any. Ideas?

One issue I ran into is that the postmaster does not seem to be processing
the barriers, and thus does not get informed about the data_checksum_version
changes. That's fine until it needs to launch a child process (e.g. a
walreceiver), which will then see the LocalDataChecksumVersion as of the
start of the instance, not the "current" one. I fixed this by explicitly
refreshing the value in postmaster_child_launch(), but maybe I'm missing
something. (Also, EXEC_BACKEND may need to handle this too.)

regards

--
Tomas Vondra

Attachments:

v20250312-0001-Online-enabling-and-disabling-of-data-chec.patch (text/x-patch)
From ae739e9c25f1fe979b501f42d164ab92be46dad5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250312 1/6] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduce functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  215 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  489 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1447 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |    3 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   31 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   17 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 +
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  139 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   37 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 58 files changed, 3194 insertions(+), 50 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 51dd8ad6571..a09843e4ecf 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29865,6 +29865,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts <glossterm linkend="glossary-data-checksums-worker"> processes</glossterm>
+     for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index aaa6586d3a4..e51bf902dc2 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6828,6 +6829,212 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a datachecksumworker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher process
+       relation
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state at the end.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations regardless of the online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, make sure that
+    <varname>max_worker_processes</varname> allows for at least two more
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over, there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,24 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +733,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +848,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +864,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4615,13 +4636,376 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedVerify(void)
 {
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages, it can thus be used to check for
+ * aborted checksum processing which need to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	SetLocalDataChecksumVersion(0);
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
+	LWLockRelease(ControlFileLock);
+}
+
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
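+
+/*
+ * Illustrative usage sketch (not a behavioural guarantee): the state
+ * transitions driven by the SQL-callable functions become visible through
+ * the existing data_checksums GUC via the show hook above, e.g.:
+ *
+ *     SHOW data_checksums;                  -- 'off'
+ *     SELECT pg_enable_data_checksums();    -- superuser only
+ *     SHOW data_checksums;                  -- 'inprogress-on' while processing
+ *     -- once all databases are processed and the barrier has been absorbed
+ *     SHOW data_checksums;                  -- 'on'
+ */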
 
 /*
@@ -6194,6 +6578,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with data checksums still in the "inprogress-on"
+	 * state, notify the user that they need to manually restart the process
+	 * of enabling checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here; it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
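+
+	/*
+	 * Illustrative operator workflow (a sketch, not enforced here): after a
+	 * restart which interrupted enabling, the WARNING above means processing
+	 * has to be re-initiated manually, e.g.:
+	 *
+	 *     SHOW data_checksums;                -- 'inprogress-on'
+	 *     SELECT pg_enable_data_checksums();  -- start over from the beginning
+	 *
+	 * An interrupted disable needs no such action, since the state is
+	 * completed to "off" just above.
+	 */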
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8612,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9058,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
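+
+/*
+ * Example usage of the above (illustrative; parameter values are arbitrary,
+ * not recommendations):
+ *
+ *     -- throttle the worker roughly like autovacuum cost-based delays
+ *     SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
+ *
+ *     -- disabling takes no arguments
+ *     SELECT pg_disable_data_checksums();
+ *
+ * A negative cost_delay or a non-positive cost_limit is rejected with an
+ * error before any worker is launched.
+ */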
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
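+
+-- Example (illustrative) of monitoring progress through the view above;
+-- blocks_total can be NULL, hence the NULLIF guard:
+--
+--   SELECT pid, datname, phase, databases_done, databases_total,
+--          round(100.0 * blocks_done / NULLIF(blocks_total, 0), 1) AS blocks_pct
+--     FROM pg_stat_progress_data_checksums;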
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..bbbce61cfa6
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1447 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums at initdb time, or on a shut-down cluster with
+ * pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state can still validate
+ * data checksums without errors.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when the checksum already matches: even if the
+ *     checksum on the page happens to already be correct, we currently dirty
+ *     the page. It should be enough to only do the log_newpage_buffer() call
+ *     in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
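+
+/*
+ * Illustrative sketch of the behaviour described above (not a test case):
+ * calling the opposite function while processing is underway merely updates
+ * the launch_* target in shared memory, and the running launcher loops back
+ * to honour it, e.g.:
+ *
+ *     SELECT pg_enable_data_checksums();   -- launcher starts enabling
+ *     SELECT pg_disable_data_checksums();  -- flips the target; the launcher
+ *                                          -- aborts enabling and disables
+ */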
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can only happen if there was a valid checksum on the
+		 * page at some point in the past, i.e. when checksums were first on,
+		 * then off, and then turned on again. If wal_level is set to
+		 * "minimal", this could be avoided when the existing checksum is
+		 * already correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort request will bubble up from here. It's safe to check this
+		 * without a lock, because if we miss it being set, we will try again
+		 * soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clean up the abort flag to ensure that processing can be restarted
+ * again after it was previously aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which are active at the call of the function
+ * have ended. If the postmaster dies while waiting, the process exits with
+ * FATAL since cluster-wide processing can no longer be completed.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, first check whether data checksums
+	 * are already fully enabled, in which case there is nothing left to do.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first run processes the shared catalogs, rather than
+	 * processing them once in every db.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not change during processing; the column
+	 * for processed databases is instead increased such that it can be
+	 * compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to look for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * it actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint so that all
+	 * checksummed pages are safely flushed to disk.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
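+
+/*
+ * Illustrative only: the immediate_checkpoint flag above corresponds to the
+ * "fast" argument of the SQL-level function, e.g. in tests:
+ *
+ *     SELECT pg_enable_data_checksums(fast => true);
+ */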
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if this is a redundant assignment, we want to be explicit about
+		 * our intent for readability, since we want to be able to query this
+		 * state in case of restartability.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned.
+ * If include_shared is True then shared relations are included as well in a
+ * non-temporary list. include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we can finish, since we
+	 * cannot access these relations to modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table remains to be waited for; indicate this in
+		 * the pgstat activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 47375e5bfaa..92d8017fd56 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -202,6 +202,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d2a7a7add6f..2fc438987b5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2947,6 +2947,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 24d88f368d8..00cce4c7673 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
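As a rough SQL-level sketch of the runtime path described in the README hunk
above (the functions and data_checksums states are the ones added by this
patch; the argument values mirror what the TAP tests pass, since no defaults
are visible in this hunk):

    -- start enabling data checksums online
    SELECT pg_enable_data_checksums(0, 100, true);
    SHOW data_checksums;   -- 'inprogress-on' until processing finishes, then 'on'

    -- and turn them off again
    SELECT pg_disable_data_checksums();
    SHOW data_checksums;   -- 'off' once all backends have acknowledged
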
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a8cb54a7732..7c9ff5c19c5 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -376,6 +376,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 3c594415bfd..b1b5cdcf36c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for enabling of data checksums to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 662ce46cbc2..2cb7766c1e2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ee1a9d5d98b..70899e6ef2d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -739,6 +739,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -869,7 +874,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ad25cbb39c5..66abf056159 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1959,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5299,6 +5295,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +231,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 42e427f8fe8..b9de4da8c53 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12253,6 +12253,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
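Since the pg_proc entries above provide proargnames, the enable function can
also be invoked with named notation; a hypothetical throttled invocation could
look like the following (the specific cost values are only illustrative):

    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200,
                                    fast => false);
    SELECT pg_disable_data_checksums();
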
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,21 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index a2b63495eec..9923c7f518d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+extern void StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+extern void DataChecksumsWorkerLauncherMain(Datum arg);
+extern void DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check", this creates a temporary installation with
+multiple nodes, primary and standby(s), for the purpose of running the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Change the underlying data and then re-enable checksums, to ensure that the
+# newly computed checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We can
+# accomplish this by setting up a background psql session which keeps the
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, but start
+# processing anyway and check that the in-progress state is reported.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6782664f4e6
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,139 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints.  0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We can
+# accomplish this by setting up a background psql session which keeps the
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, but start
+# processing anyway and check that the in-progress state is reported.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..2cfea837554 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,43 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'waiting on checkpoint'::text
+            WHEN 6 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
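With the view definition above in place, the processing added by this patch can
be observed with a query along these lines (all column names are taken from the
view; the selection is just an illustrative subset):

    SELECT pid, datname, phase, relations_done, relations_total,
           blocks_done, blocks_total
    FROM pg_stat_progress_data_checksums;
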
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index f77caacc17d..85e15d4cc2b 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -87,7 +103,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(71 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index dfe2690bdd3..e5be803e159 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -593,6 +594,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4143,6 +4148,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1

Attachment: v20250312-0002-review-fixes.patch (text/x-patch)
From ebf63bf17a022d6c8068aefba38d66538642240b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 14:56:14 +0100
Subject: [PATCH v20250312 2/6] review fixes

---
 doc/src/sgml/monitoring.sgml                 | 13 +++----------
 src/backend/catalog/system_views.sql         |  7 +++----
 src/backend/postmaster/datachecksumsworker.c | 11 +----------
 src/include/commands/progress.h              |  5 ++---
 src/include/miscadmin.h                      |  4 +++-
 src/test/regress/expected/rules.out          |  7 +++----
 6 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e51bf902dc2..8fec0277f97 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6911,7 +6911,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
        <para>
         The total number of databases which will be processed. Only the
         launcher worker has this value set, the other worker processes
-        have this <literal>NULL</literal>.
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6924,7 +6924,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
        <para>
         The number of databases which have been processed. Only the
         launcher worker has this value set, the other worker processes
-        have this <literal>NULL</literal>.
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -7009,13 +7009,6 @@ FROM pg_stat_get_backend_idset() AS backendid;
        The command is currently disabling data checksums on the cluster.
       </entry>
      </row>
-     <row>
-      <entry><literal>waiting on backends</literal></entry>
-      <entry>
-       The command is currently waiting for backends to acknowledge the data
-       checksum operation.
-      </entry>
-     </row>
      <row>
       <entry><literal>waiting on temporary tables</literal></entry>
       <entry>
@@ -7027,7 +7020,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
       <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
        The command is currently waiting for a checkpoint to update the checksum
-       state at the end.
+       state before finishing.
       </entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 4330d0ad656..6ffd31ce39c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1340,10 +1340,9 @@ CREATE VIEW pg_stat_progress_data_checksums AS
         CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
                       WHEN 2 THEN 'waiting'
-                      WHEN 3 THEN 'waiting on backends'
-                      WHEN 4 THEN 'waiting on temporary tables'
-                      WHEN 5 THEN 'waiting on checkpoint'
-                      WHEN 6 THEN 'done'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
                       END AS phase,
         CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
         S.param3 AS databases_done,
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index bbbce61cfa6..d9833cd4c98 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -801,15 +801,6 @@ again:
 		}
 		RESUME_INTERRUPTS();
 
-		/*
-		 * Initialize progress and indicate that we are waiting on the other
-		 * backends to clear the procsignalbarrier.
-		 */
-		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
-									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
-
-		/* XXX isn't it weird there's no wait between the phase updates? */
-
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -1083,7 +1074,7 @@ ProcessAllDatabases(bool immediate_checkpoint)
 
 	/*
 	 * When enabling checksums, we have to wait for a checkpoint for the
-	 * checksums to e.
+	 * checksums to change from in-progress to on.
 	 */
 	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 94b478a6cc9..b172a5f24ce 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -170,8 +170,7 @@
 #define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
 #define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
 #define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
 
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9923c7f518d..69b1dc720f6 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -392,7 +392,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
-#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2cfea837554..0dd383a76dd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2048,10 +2048,9 @@ pg_stat_progress_data_checksums| SELECT s.pid,
             WHEN 0 THEN 'enabling'::text
             WHEN 1 THEN 'disabling'::text
             WHEN 2 THEN 'waiting'::text
-            WHEN 3 THEN 'waiting on backends'::text
-            WHEN 4 THEN 'waiting on temporary tables'::text
-            WHEN 5 THEN 'waiting on checkpoint'::text
-            WHEN 6 THEN 'done'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
             ELSE NULL::text
         END AS phase,
         CASE s.param2
-- 
2.48.1

Attachment: v20250312-0003-pgindent.patch (text/x-patch)
From 3fb57e3846f6f015b0a4d9d9165197d5d67dd5bc Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 14:57:05 +0100
Subject: [PATCH v20250312 3/6] pgindent

---
 src/backend/postmaster/datachecksumsworker.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index d9833cd4c98..6a201dca8de 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -921,7 +921,7 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
 		};
 
-		int64	vals[6];
+		int64		vals[6];
 
 		vals[0] = list_length(DatabaseList);
 		vals[1] = 0;
@@ -974,7 +974,8 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			processed_databases++;
 
 			/*
-			 * Update the number of processed databases in the progress report.
+			 * Update the number of processed databases in the progress
+			 * report.
 			 */
 			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
 										 processed_databases);
@@ -1126,9 +1127,9 @@ DataChecksumsWorkerShmemInit(void)
 		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
 
 		/*
-		 * Even if this is a redundant assignment, we want to be explicit about
-		 * our intent for readability, since we want to be able to query this
-		 * state in case of restartability.
+		 * Even if this is a redundant assignment, we want to be explicit
+		 * about our intent for readability, since we want to be able to query
+		 * this state in case of restartability.
 		 */
 		DataChecksumsWorkerShmem->launch_enable_checksums = false;
 		DataChecksumsWorkerShmem->launcher_running = false;
@@ -1339,7 +1340,7 @@ DataChecksumsWorkerMain(Datum arg)
 			PROGRESS_DATACHECKSUMS_RELS_DONE
 		};
 
-		int64	vals[2];
+		int64		vals[2];
 
 		vals[0] = list_length(RelationList);
 		vals[1] = 0;
-- 
2.48.1

Attachment: v20250312-0004-perltidy.patch (text/x-patch)
From 1585dc3d1c8d2887a82ab41cc26433acac7039f5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 15:05:43 +0100
Subject: [PATCH v20250312 4/6] perltidy

---
 src/test/perl/PostgreSQL/Test/Cluster.pm | 6 ++++--
 src/test/subscription/t/013_partition.pl | 3 +--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 666bd2a2d4c..1c66360c16c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3761,7 +3761,8 @@ sub checksum_enable_offline
 	my ($self) = @_;
 
 	print "### Enabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
 	return;
 }
 
@@ -3778,7 +3779,8 @@ sub checksum_disable_offline
 	my ($self) = @_;
 
 	print "### Disabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
 	return;
 }
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 61b0cb4aa1a..4f78dd48815 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -51,8 +51,7 @@ $node_subscriber1->safe_psql('postgres',
 );
 # make a BRIN index to test aminsertcleanup logic in subscriber
 $node_subscriber1->safe_psql('postgres',
-	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)"
-);
+	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)"
 );
-- 
2.48.1

Attachment: v20250312-0005-data_checksum_version-reworks.patch (text/x-patch)
From b78ba0ddda20dd2331358c157589ac80c807705a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Tue, 11 Mar 2025 19:16:23 +0100
Subject: [PATCH v20250312 5/6] data_checksum_version reworks

---
 src/backend/access/transam/xlog.c       | 133 +++++++++++++++++-------
 src/backend/postmaster/launch_backend.c |  10 ++
 src/include/catalog/pg_control.h        |   5 +-
 src/include/miscadmin.h                 |   4 +
 4 files changed, 116 insertions(+), 36 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f137cdc6d42..61da6d583cd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -550,6 +550,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -4262,6 +4265,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4601,8 +4610,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4734,9 +4741,9 @@ SetDataChecksumsOnInProgress(void)
 
 	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
 
@@ -4780,28 +4787,28 @@ SetDataChecksumsOn(void)
 
 	Assert(ControlFile != NULL);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	SpinLockAcquire(&XLogCtl->info_lck);
 
 	/*
 	 * The only allowed state transition to "on" is from "inprogress-on" since
 	 * that state ensures that all pages will have data checksums written.
 	 */
-	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
 	{
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 		elog(ERROR, "checksums not in \"inprogress-on\" mode");
 	}
 
-	LWLockRelease(ControlFileLock);
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
 	START_CRIT_SECTION();
 
 	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
 
@@ -4836,12 +4843,12 @@ SetDataChecksumsOff(void)
 
 	Assert(ControlFile);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	SpinLockAcquire(&XLogCtl->info_lck);
 
 	/* If data checksums are already disabled there is nothing to do */
-	if (ControlFile->data_checksum_version == 0)
+	if (XLogCtl->data_checksum_version == 0)
 	{
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 		return;
 	}
 
@@ -4852,18 +4859,18 @@ SetDataChecksumsOff(void)
 	 * "inprogress-off" the next transition to "off" can be performed, after
 	 * which all data checksum processing is disabled.
 	 */
-	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
 	{
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 
 		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
 		START_CRIT_SECTION();
 
 		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
-		LWLockRelease(ControlFileLock);
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
 
 		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
 
@@ -4890,7 +4897,7 @@ SetDataChecksumsOff(void)
 		 * or "inprogress-off" and we can transition directly to "off" from
 		 * there.
 		 */
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 	}
 
 	/*
@@ -4901,9 +4908,9 @@ SetDataChecksumsOff(void)
 
 	XLogChecksums(0);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	ControlFile->data_checksum_version = 0;
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
 
@@ -4958,9 +4965,9 @@ AbsorbChecksumsOffBarrier(void)
 void
 InitLocalControldata(void)
 {
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
 }
 
 /*
@@ -4994,6 +5001,40 @@ SetLocalDataChecksumVersion(uint32 data_checksum_version)
 	}
 }
 
+/*
+ * Initialize the various data checksum values - GUC, local, ....
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	uint32	data_checksum_version;
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	data_checksum_version = XLogCtl->data_checksum_version;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	SetLocalDataChecksumVersion(data_checksum_version);
+}
+
+/*
+ * Get the local data_checksum_version (cached XLogCtl value).
+ */
+uint32
+GetLocalDataChecksumVersion(void)
+{
+	return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+	return XLogCtl->data_checksum_version;
+}
+
 /* guc hook */
 const char *
 show_data_checksums(void)
@@ -5447,6 +5488,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6585,7 +6631,7 @@ StartupXLOG(void)
 	 * background worker directly from here, it has to be launched from a
 	 * regular backend.
 	 */
-	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
 		ereport(WARNING,
 				(errmsg("data checksums are being enabled, but no worker is running"),
 				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
@@ -6596,13 +6642,13 @@ StartupXLOG(void)
 	 * checksums and we can move to off instead of prompting the user to
 	 * perform any action.
 	 */
-	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
 	{
 		XLogChecksums(0);
 
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-		ControlFile->data_checksum_version = 0;
-		LWLockRelease(ControlFileLock);
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SpinLockRelease(&XLogCtl->info_lck);
 	}
 
 	/*
@@ -7460,6 +7506,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from xlogctl, valid at
+	 * the time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7715,6 +7767,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7861,6 +7916,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -8203,6 +8262,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -9065,9 +9128,9 @@ xlog_redo(XLogReaderState *record)
 
 		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
 
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-		ControlFile->data_checksum_version = state.new_checksumtype;
-		LWLockRelease(ControlFileLock);
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
 
 		/*
 		 * Block on a procsignalbarrier to await all processes having seen the
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 92d8017fd56..fc6e2c6ed41 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -271,6 +271,16 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 			memcpy(MyClientSocket, client_sock, sizeof(ClientSocket));
 		}
 
+		/*
+		 * update the LocalProcessControlFile to match XLogCtl->data_checksum_version
+		 *
+		 * XXX It seems the postmaster (which is what gets forked into the new
+		 * child process) does not absorb the checksum barriers, therefore it
+		 * does not update the value (except after a restart). Not sure if there
+		 * is some sort of race condition.
+		 */
+		InitLocalDataChecksumVersion();
+
 		/*
 		 * Run the appropriate Main function
 		 */
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 9b5a50adf54..727042b73be 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksums at the time of the checkpoint  */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -220,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;		/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 69b1dc720f6..befab157ecf 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -532,6 +532,10 @@ extern Size EstimateClientConnectionInfoSpace(void);
 extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
 extern void RestoreClientConnectionInfo(char *conninfo);
 
+extern uint32 GetLocalDataChecksumVersion(void);
+extern uint32 GetCurrentDataChecksumVersion(void);
+extern void InitLocalDataChecksumVersion(void);
+
 /* in executor/nodeHash.c */
 extern size_t get_hash_memory_limit(void);
 
-- 
2.48.1

Attachment: v20250312-0006-debug.patch (text/x-patch)
From a7aa7c355418db44b23cdcb3e920bd95a9891d13 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Wed, 12 Mar 2025 12:40:46 +0100
Subject: [PATCH v20250312 6/6] debug

---
 src/backend/access/transam/xlog.c       | 43 +++++++++++++++++++++++++
 src/backend/postmaster/launch_backend.c |  3 ++
 src/backend/storage/ipc/procsignal.c    | 34 +++++++++++++++++++
 src/backend/utils/init/miscinit.c       |  3 ++
 4 files changed, 83 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 61da6d583cd..63d74e95075 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4271,6 +4271,8 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	 * processes get the current value from. (Maybe it should go just there?)
 	 */
 	XLogCtl->data_checksum_version = data_checksum_version;
+
+	elog(LOG, "InitControlFile %p data_checksum_version %u", XLogCtl, ControlFile->data_checksum_version);
 }
 
 static void
@@ -4928,6 +4930,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
+	elog(LOG, "AbsorbChecksumsOnInProgressBarrier");
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
@@ -4935,6 +4938,7 @@ AbsorbChecksumsOnInProgressBarrier(void)
 bool
 AbsorbChecksumsOnBarrier(void)
 {
+	elog(LOG, "AbsorbChecksumsOnBarrier");
 	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
@@ -4943,6 +4947,7 @@ AbsorbChecksumsOnBarrier(void)
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
+	elog(LOG, "AbsorbChecksumsOffInProgressBarrier");
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
@@ -4950,6 +4955,7 @@ AbsorbChecksumsOffInProgressBarrier(void)
 bool
 AbsorbChecksumsOffBarrier(void)
 {
+	elog(LOG, "AbsorbChecksumsOffBarrier");
 	SetLocalDataChecksumVersion(0);
 	return true;
 }
@@ -4979,6 +4985,7 @@ InitLocalControldata(void)
 void
 SetLocalDataChecksumVersion(uint32 data_checksum_version)
 {
+	elog(LOG, "SetLocalDataChecksumVersion %u", data_checksum_version);
 	LocalDataChecksumVersion = data_checksum_version;
 
 	switch (LocalDataChecksumVersion)
@@ -5488,6 +5495,9 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	elog(LOG, "XLogCtl->data_checksum_version %u ControlFile->data_checksum_version %u",
+		 XLogCtl->data_checksum_version, ControlFile->data_checksum_version);
+
 	/* use the checksum info from control file */
 	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
 
@@ -7512,6 +7522,8 @@ CreateCheckPoint(int flags)
 	 */
 	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
 
+	elog(WARNING, "CREATECHECKPOINT XLogCtl->data_checksum_version %u", XLogCtl->data_checksum_version);
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7767,6 +7779,10 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	elog(LOG, "CreateCheckPoint data_checksum_version %u %u",
+		 ControlFile->data_checksum_version,
+		 checkPoint.data_checksum_version);
+
 	/* make sure we start with the checksum version as of the checkpoint */
 	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
 
@@ -7917,6 +7933,10 @@ CreateEndOfRecoveryRecord(void)
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
 
+	elog(LOG, "CreateEndOfRecoveryRecord data_checksum_version %u xlog %u",
+		 ControlFile->data_checksum_version,
+		 XLogCtl->data_checksum_version);
+
 	/* start with the latest checksum version (as of the end of recovery) */
 	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
 
@@ -8127,6 +8147,9 @@ CreateRestartPoint(int flags)
 	lastCheckPoint = XLogCtl->lastCheckPoint;
 	SpinLockRelease(&XLogCtl->info_lck);
 
+	elog(LOG, "CreateRestartPoint lastCheckPointRecPtr %X/%X lastCheckPointEndPtr %X/%X",
+		 LSN_FORMAT_ARGS(lastCheckPointRecPtr), LSN_FORMAT_ARGS(lastCheckPointEndPtr));
+
 	/*
 	 * Check that we're still in recovery mode. It's ok if we exit recovery
 	 * mode after this check, the restart point is valid anyway.
@@ -8263,6 +8286,10 @@ CreateRestartPoint(int flags)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
 
+		elog(LOG, "CreateRestartPoint data_checksum_version %u %u",
+			 ControlFile->data_checksum_version,
+			 lastCheckPoint.data_checksum_version);
+
 		/* we shall start with the latest checksum version */
 		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
 
@@ -9125,13 +9152,25 @@ xlog_redo(XLogReaderState *record)
 	{
 		xl_checksum_state state;
 		uint64		barrier;
+		XLogRecPtr	checkpointLsn;
+		uint32		value,
+					value_last;
 
 		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
 
 		SpinLockAcquire(&XLogCtl->info_lck);
+		value_last = XLogCtl->data_checksum_version;
 		XLogCtl->data_checksum_version = state.new_checksumtype;
 		SpinLockRelease(&XLogCtl->info_lck);
 
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		checkpointLsn = ControlFile->checkPoint;
+		value = ControlFile->data_checksum_version;
+		LWLockRelease(ControlFileLock);
+
+		elog(LOG, "XLOG_CHECKSUMS xlog_redo %X/%X control checkpoint %X/%X control %u last %u record %u",
+			 LSN_FORMAT_ARGS(lsn), LSN_FORMAT_ARGS(checkpointLsn), value, value_last, state.new_checksumtype);
+
 		/*
 		 * Block on a procsignalbarrier to await all processes having seen the
 		 * change to checksum status. Once the barrier has been passed we can
@@ -9140,22 +9179,26 @@ xlog_redo(XLogReaderState *record)
 		switch (state.new_checksumtype)
 		{
 			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			case PG_DATA_CHECKSUM_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_ON");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			default:
 				Assert(state.new_checksumtype == 0);
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_OFF");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
 				WaitForProcSignalBarrier(barrier);
 				break;
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index fc6e2c6ed41..2034c204cee 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -235,6 +235,9 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 
 	Assert(IsPostmasterEnvironment && !IsUnderPostmaster);
 
+	elog(LOG, "postmaster_child_launch: LocalDataChecksumVersion %u xlog %u", GetLocalDataChecksumVersion(),
+	GetCurrentDataChecksumVersion());
+
 #ifdef EXEC_BACKEND
 	pid = internal_forkexec(child_process_kinds[child_type].name, child_slot,
 							startup_data, startup_data_len, client_sock);
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 2b13a8cd260..b88d1d07431 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -553,6 +553,40 @@ ProcessProcSignalBarrier(void)
 
 		PG_TRY();
 		{
+			/* print info about barriers */
+			{
+				uint32	tmp = flags;
+
+				elog(LOG, "ProcessProcSignalBarrier flags %u", tmp);
+
+				while (tmp != 0)
+				{
+					ProcSignalBarrierType type;
+
+					type = (ProcSignalBarrierType) pg_rightmost_one_pos32(tmp);
+					switch (type)
+					{
+						case PROCSIGNAL_BARRIER_SMGRRELEASE:
+							elog(LOG, "PROCSIGNAL_BARRIER_SMGRRELEASE");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_ON");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_OFF");
+							break;
+					}
+
+					BARRIER_CLEAR_BIT(tmp, type);
+				}
+			}
+
 			/*
 			 * Process each type of barrier. The barrier-processing functions
 			 * should normally return true, but may return false if the
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index a071ba6f455..df52ce8ad7f 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -164,6 +164,9 @@ InitPostmasterChild(void)
 				(errcode_for_socket_access(),
 				 errmsg_internal("could not set postmaster death monitoring pipe to FD_CLOEXEC mode: %m")));
 #endif
+
+	elog(LOG, "InitPostmasterChild: LocalDataChecksumVersion %u xlog %u", GetLocalDataChecksumVersion(),
+	GetCurrentDataChecksumVersion());
 }
 
 /*
-- 
2.48.1

Attachment: checksum-test.sh (application/x-shellscript)
#33Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#32)
Re: Changing the state of data checksums in a running cluster

On 12 Mar 2025, at 14:16, Tomas Vondra <tomas@vondra.me> wrote:

I continued investigating this and experimenting with alternative
approaches, and I think the way the patch relies on ControlFile is not
quite right. That is, it always sets data_checksum_version to the last
("current") value, but that's not what ControlFile is for ...

Agreed, that's a thinko on my part. Reading it makes it clear, but I had
failed to see that when hacking =/

XLogCtl seemed like a good place, so I used that - after all, it's a
value from XLOG. Maybe there's a better place? I'm open to suggestions,
but it does not really affect the overall approach.

Seems like a good place for it.

So all the places now update XLogCtl->data_checksum_version instead of
the ControlFile, and also query this flag for the *current* value.

The value is copied from XLogCtl to ControlFile when creating checkpoint
(or restartpoint), and the control file is persisted. This means (a) the
current value can't get written to the control file prematurely, and (b)
the value is consistent with the checkpoint (i.e. with the LSN where we
start crash recovery, if needed).

+1

The attached 0005 patch implements this. It's a bit WIP and I'm sure it
can be improved, but I'm yet to see a single crash/failure with it. With
the original patch I've seen crashes after 5-10 loops (i.e. a couple
minutes), I'm now at loop 1000 and it's still OK.

Given how successful this test has been at shaking out errors, that is indeed
comforting to hear.

I believe the approach is correct, but the number of possible states
(e.g. after a crash/restart) seems a bit complex. I wonder if there's a
better way to handle this, but I can't think of any. Ideas?

Not sure if this moves the needle too far in terms of complexity wrt to the
previous version of the patch, there were already multiple copies.

One issue I ran into is the postmaster does not seem to be processing
the barriers, and thus not getting info about the data_checksum_version
changes.

Makes sense, that seems like a pretty reasonable constraint for the barrier.

That's fine until it needs to launch a child process (e.g. a
walreceiver), which will then see the LocalDataChecksumVersion as of the
start of the instance, not the "current" one. I fixed this by explicitly
refreshing the value in postmaster_child_launch(), but maybe I'm missing
something. (Also, EXEC_BACKEND may need to handle this too.)

The pg_checksums test is failing for me on this version due to the GUC not
being initialized, don't we need something like the below as well? (With a
comment explaining why ReadControlFile wasn't enough.)

@@ -5319,6 +5319,7 @@ LocalProcessControlFile(bool reset)
        Assert(reset || ControlFile == NULL);
        ControlFile = palloc(sizeof(ControlFileData));
        ReadControlFile();
+       SetLocalDataChecksumVersion(ControlFile->data_checksum_version);

A few comments on the patchset:

+ * Local state fror Controlfile data_checksum_version. After initialization
s/fror/for/. Also, this is no longer true as it's a local copy of the XlogCtl
value and not the Controlfile value (which may or may not be equal).

- if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+ if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
 	ereport(WARNING,
 		(errmsg("data checksums are being enabled, but no worker is running"),
 		 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
Reading this made me realize what a terrible error message I had placed there;
the hint is good, but the message says checksums are being enabled when they're
not actually being enabled.  Maybe "data checksums are marked as being in-progress, but
no worker is running" would be better.
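In code form the reworded warning might read roughly like this (just a sketch
of the wording change, keeping the existing hint):

	ereport(WARNING,
			(errmsg("data checksums are marked as being in-progress, but no worker is running"),
			 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
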
+uint32
+GetLocalDataChecksumVersion(void)
+{
+ return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+ return XLogCtl->data_checksum_version;
+}
I wonder if CachedDataChecksumVersion would be more appropriate to distinguish
it from the Current value, and also to make it appear less like actual copies of
controlfile values such as LocalMinRecoveryPoint.  Another thought is whether we
should have the GetLocalDataChecksumVersion() API at all?
GetCurrentDataChecksumVersion() seems like it would be the better API, no?

--
Daniel Gustafsson

#34Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#33)
Re: Changing the state of data checksums in a running cluster

On 3/13/25 10:54, Daniel Gustafsson wrote:

On 12 Mar 2025, at 14:16, Tomas Vondra <tomas@vondra.me> wrote:

I continued investigating this and experimenting with alternative
approaches, and I think the way the patch relies on ControlFile is not
quite right. That is, it always sets data_checksum_version to the last
("current") value, but that's not what ControlFile is for ...

Agreed, that's a thinko on my part. Reading it makes it clear, but I had
failed to see that when hacking =/

It wasn't obvious to me either, until I managed to trigger the failure
and investigated the root cause.

XLogCtl seemed like a good place, so I used that - after all, it's a
value from XLOG. Maybe there's a better place? I'm open to suggestions,
but it does not really affect the overall approach.

Seems like a good place for it.

OK

So all the places now update XLogCtl->data_checksum_version instead of
the ControlFile, and also query this flag for the *current* value.

The value is copied from XLogCtl to ControlFile when creating checkpoint
(or restartpoint), and the control file is persisted. This means (a) the
current value can't get written to the control file prematurely, and (b)
the value is consistent with the checkpoint (i.e. with the LSN where we
start crash recovery, if needed).

+1

OK. I still want to go over the places once more and double check it
sets the ControlFile value to the right data_checksum_version.

The attached 0005 patch implements this. It's a bit WIP and I'm sure it
can be improved, but I'm yet to see a single crash/failure with it. With
the original patch I've seen crashes after 5-10 loops (i.e. a couple
minutes), I'm now at loop 1000 and it's still OK.

Given how successful this test has been at shaking out errors, that is indeed
comforting to hear.

It is. I plan to vary the stress test a bit more, and also run it on
another machine (rpi5, to get some non-x86 testing).

I believe the approach is correct, but the number of possible states
(e.g. after a crash/restart) seems a bit complex. I wonder if there's a
better way to handle this, but I can't think of any. Ideas?

Not sure if this moves the needle too far in terms of complexity wrt to the
previous version of the patch, there were already multiple copies.

It does add one more place (XLogCtl->data_checksum_version) to store the
current state, so it's not that much more complex, ofc. But I was not
really comparing this to the previous patch version, I meant the state
space in general - all possible combinations of all the flags (control
file, local + xlogct).

I wonder if it might be possible to have a more thorough validation of
the transitions. We already have that for the LocalDataChecksumVersion,
thanks to the asserts - and it was damn useful, otherwise we would not
have noticed this issue for a long time, I think.

I wonder if we can have similar checks for the other flags. I'm pretty
sure we can have the same checks for XLogCtl, right? I'm not quite sure
about ControlFile - can't that "skip" some of the changes, e.g. if we do
(enable->disable->enable) within a single checkpoint? Need to check.
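
For the XLogCtl field that could be as simple as asserting the only allowed
predecessor state while holding info_lck, mirroring the Assert we already have
in AbsorbChecksumsOnBarrier() for the local value.  Just a sketch (not in the
posted patches), for the transition to "on":

	SpinLockAcquire(&XLogCtl->info_lck);
	Assert(XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
	SpinLockRelease(&XLogCtl->info_lck);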

This also reminds me I had a question about the barrier - can't it
happen a process gets to process multiple barriers at the same time? I
mean, let's say it gets stuck for a while, and the cluster happens to go
through disable+enable. Won't it then see both barriers? That'd be a
problem, because the core processes the barriers in the order determined
by the enum value, not in the order the barriers happened. Which means
it might break the expected state transitions again (and end with the
wrong local value). I haven't tried, though.
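
For context, the loop in ProcessProcSignalBarrier() (the same loop the 0006
debug patch copies for logging) always picks the lowest-numbered pending
barrier type, which is why the order follows the enum rather than the
emission order:

	while (flags != 0)
	{
		ProcSignalBarrierType type;

		type = (ProcSignalBarrierType) pg_rightmost_one_pos32(flags);
		/* ... absorb the barrier of this type ... */
		BARRIER_CLEAR_BIT(flags, type);
	}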

One issue I ran into is the postmaster does not seem to be processing
the barriers, and thus not getting info about the data_checksum_version
changes.

Makes sense, that seems like a pretty reasonable constraint for the barrier.

Not sure I follow. What's a reasonable constraint?

That's fine until it needs to launch a child process (e.g. a
walreceiver), which will then see the LocalDataChecksumVersion as of the
start of the instance, not the "current" one. I fixed this by explicitly
refreshing the value in postmaster_child_launch(), but maybe I'm missing
something. (Also, EXEC_BACKEND may need to handle this too.)

The pg_checksums test is failing for me on this version due to the GUC not
being initialized, don't we need something like the below as well? (With a
comment explaining why ReadControlFile wasn't enough.)

@@ -5319,6 +5319,7 @@ LocalProcessControlFile(bool reset)
Assert(reset || ControlFile == NULL);
ControlFile = palloc(sizeof(ControlFileData));
ReadControlFile();
+       SetLocalDataChecksumVersion(ControlFile->data_checksum_version);

Yeah, I think this (or something like it) is missing.

A few comments on the patchset:

+ * Local state fror Controlfile data_checksum_version. After initialization
s/fror/for/. Also, this is no longer true as it's a local copy of the XlogCtl
value and not the Controlfile value (which may or may not be equal).

- if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+ if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
ereport(WARNING,
(errmsg("data checksums are being enabled, but no worker is running"),
errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
Reading this made me realize what a terrible error message I had placed there;
the hint is good, but the message says checksums are being enabled when they're
not actually being enabled.  Maybe "data checksums are marked as being in-progress, but
no worker is running" would be better.

Makes sense, will reword.

+uint32
+GetLocalDataChecksumVersion(void)
+{
+ return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+ return XLogCtl->data_checksum_version;
+}
I wonder if CachedDataChecksumVersion would be more appropriate to distinguish
it from the Current value, and also to make it appear less like actual copies of
controlfile values such as LocalMinRecoveryPoint.  Another thought is whether we
should have the GetLocalDataChecksumVersion() API at all?
GetCurrentDataChecksumVersion() seems like it would be the better API, no?

FWIW those functions are for debug logging only, I needed to print the
values in a couple places outside xlog.c. I don't intend to make that
part of the patch.

regards

--
Tomas Vondra

#35Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#34)
Re: Changing the state of data checksums in a running cluster

On 13 Mar 2025, at 12:03, Tomas Vondra <tomas@vondra.me> wrote:
On 3/13/25 10:54, Daniel Gustafsson wrote:

On 12 Mar 2025, at 14:16, Tomas Vondra <tomas@vondra.me> wrote:

I believe the approach is correct, but the number of possible states
(e.g. after a crash/restart) seems a bit complex. I wonder if there's a
better way to handle this, but I can't think of any. Ideas?

Not sure if this moves the needle too far in terms of complexity wrt to the
previous version of the patch, there were already multiple copies.

It does add one more place (XLogCtl->data_checksum_version) to store the
current state, so it's not that much more complex, ofc. But I was not
really comparing this to the previous patch version, I meant the state
space in general - all possible combinations of all the flags (control
file, local + xlogct).

Fair point.

I wonder if it might be possible to have a more thorough validation of
the transitions. We already have that for the LocalDataChecksumVersion,
thanks to the asserts - and it was damn useful, otherwise we would not
have noticed this issue for a long time, I think.

I wonder if we can have similar checks for the other flags. I'm pretty
sure we can have the same checks for XLogCtl, right?

I don't see why not, they should abide by the same rules.

I'm not quite sure
about ControlFile - can't that "skip" some of the changes, e.g. if we do
(enable->disable->enable) within a single checkpoint? Need to check.

For enable->disable->enable within a single checkpoint, ControlFile should
never see the disable state.

This also reminds me I had a question about the barrier - can't it
happen a process gets to process multiple barriers at the same time? I
mean, let's say it gets stuck for a while, and the cluster happens to go
through disable+enable. Won't it then see both barriers? That'd be a
problem, because the core processes the barriers in the order determined
by the enum value, not in the order the barriers happened. Which means
it might break the expected state transitions again (and end with the
wrong local value). I haven't tried, though.

Interesting, that seems like a general deficiency in the barriers, surely
processing them in-order would be more intuitive? That would probably require
some form of Lamport clock though.

One issue I ran into is the postmaster does not seem to be processing
the barriers, and thus not getting info about the data_checksum_version
changes.

Makes sense, that seems like a pretty reasonable constraint for the barrier.

Not sure I follow. What's a reasonable constraint?

That the postmaster doesn't process them.

That's fine until it needs to launch a child process (e.g. a
walreceiver), which will then see the LocalDataChecksumVersion as of the
start of the instance, not the "current" one. I fixed this by explicitly
refreshing the value in postmaster_child_launch(), but maybe I'm missing
something. (Also, EXEC_BACKEND may need to handle this too.)

The pg_checksums test is failing for me on this version due to the GUC not
being initialized, don't we need something like the below as well? (With a
comment explaining why ReadControlFile wasn't enough.)

@@ -5319,6 +5319,7 @@ LocalProcessControlFile(bool reset)
Assert(reset || ControlFile == NULL);
ControlFile = palloc(sizeof(ControlFileData));
ReadControlFile();
+       SetLocalDataChecksumVersion(ControlFile->data_checksum_version);

Yeah, I think this (or something like it) is missing.

Thanks for confirming.

+uint32
+GetLocalDataChecksumVersion(void)
+{
+ return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+ return XLogCtl->data_checksum_version;
+}
I wonder if CachedDataChecksumVersion would be more appropriate to distinguish
it from the Current value, and also to make it appear less like actual copies of
controlfile values such as LocalMinRecoveryPoint.  Another thought is whether we
should have the GetLocalDataChecksumVersion() API at all?
GetCurrentDataChecksumVersion() seems like it would be the better API, no?

FWIW those functions are for debug logging only, I needed to print the
values in a couple places outside xlog.c. I don't intend to make that
part of the patch.

Ah, gotcha, I never applied the debug patch from the patchset so I figured this
was a planned API. The main question still stands though: is LocalDataCheckXX
confusing, and would CachedDataCheckXX be better in order to
distinguish it from actual controlfile copies?

--
Daniel Gustafsson

#36Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#35)
Re: Changing the state of data checksums in a running cluster

On 3/13/25 13:32, Daniel Gustafsson wrote:

On 13 Mar 2025, at 12:03, Tomas Vondra <tomas@vondra.me> wrote:
On 3/13/25 10:54, Daniel Gustafsson wrote:

On 12 Mar 2025, at 14:16, Tomas Vondra <tomas@vondra.me> wrote:

I believe the approach is correct, but the number of possible states
(e.g. after a crash/restart) seems a bit complex. I wonder if there's a
better way to handle this, but I can't think of any. Ideas?

Not sure if this moves the needle too far in terms of complexity wrt to the
previous version of the patch, there were already multiple copies.

It does add one more place (XLogCtl->data_checksum_version) to store the
current state, so it's not that much more complex, ofc. But I was not
really comparing this to the previous patch version, I meant the state
space in general - all possible combinations of all the flags (control
file, local + xlogct).

Fair point.

I wonder if it might be possible to have a more thorough validation of
the transitions. We already have that for the LocalDataChecksumVersion,
thanks to the asserts - and it was damn useful, otherwise we would not
have noticed this issue for a long time, I think.

I wonder if we can have similar checks for the other flags. I'm pretty
sure we can have the same checks for XLogCtl, right?

I don't see why not, they should abide by the same rules.

OK, I'll add these asserts.

I'm not quite sure
about ControlFile - can't that "skip" some of the changes, e.g. if we do
(enable->disable->enable) within a single checkpoint? Need to check.

For enable->disable->enable within a single checkpoint, ControlFile should
never see the disable state.

Hmm, that means we can't have the same checks for the ControlFile
fields, but I don't think that's a problem. We've verified the "path" to
that (on the XLogCtl field), so that seems fine.

This also reminds me I had a question about the barrier - can't it
happen a process gets to process multiple barriers at the same time? I
mean, let's say it gets stuck for a while, and the cluster happens to go
through disable+enable. Won't it then see both barriers? That'd be a
problem, because the core processes the barriers in the order determined
by the enum value, not in the order the barriers happened. Which means
it might break the expected state transitions again (and end with the
wrong local value). I haven't tried, though.

Interesting, that seems like a general deficiency in the barriers, surely
processing them in-order would be more intuitive? That would probably require
some form of Lamport clock though.

Yeah, that seems non-trivial. What if we instead ensured there can't be
two barriers set at the same time? Say, if we (somehow) ensured all
processes saw the previous barrier before allowing a new one, we would
not have this issue, right?

But I don't know what would be a good way to ensure this. Is there a way
to check if all processes saw the barrier? Any ideas?

One issue I ran into is the postmaster does not seem to be processing
the barriers, and thus not getting info about the data_checksum_version
changes.

Makes sense, that seems like a pretty reasonable constraint for the barrier.

Not sure I follow. What's a reasonable constraint?

That the postmaster doesn't process them.

OK, that means we need a way to "refresh" the value for new child
processes, similar to what my patch does. But I suspect there might be
a race condition - if the child process starts while processing the
XLOG_CHECKSUMS record, it might happen to get the new value and then also
the barrier (if it does the "refresh" in between the XLogCtl update and
the barrier). Doesn't this need some sort of interlock, preventing this?

The child startup would need to do this:

1) acquire lock
2) reset barriers
3) refresh the LocalDataChecksumValue (from XLogCtl)
4) release lock

while the walreceiver would do this

1) acquire lock
2) update XLogCtl value
3) emit barrier
4) release lock

Or is there a reason why this would be unnecessary?
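
For concreteness, a rough sketch of that interlock (ChecksumStateLock is a
made-up lock name, everything else follows the 0005 patch):

	/* side changing the state */
	uint64		barrier;

	LWLockAcquire(ChecksumStateLock, LW_EXCLUSIVE);

	SpinLockAcquire(&XLogCtl->info_lck);
	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
	SpinLockRelease(&XLogCtl->info_lck);

	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
	LWLockRelease(ChecksumStateLock);

	WaitForProcSignalBarrier(barrier);

	/* starting child, before it can absorb any checksum barrier */
	uint32		version;

	LWLockAcquire(ChecksumStateLock, LW_SHARED);
	/* reset/initialize this backend's barrier state here (ProcSignalInit) */
	SpinLockAcquire(&XLogCtl->info_lck);
	version = XLogCtl->data_checksum_version;
	SpinLockRelease(&XLogCtl->info_lck);
	LWLockRelease(ChecksumStateLock);

	SetLocalDataChecksumVersion(version);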

That's fine until it needs to launch a child process (e.g. a
walreceiver), which will then see the LocalDataChecksumVersion as of the
start of the instance, not the "current" one. I fixed this by explicitly
refreshing the value in postmaster_child_launch(), but maybe I'm missing
something. (Also, EXEC_BACKEND may need to handle this too.)

The pg_checksums test is failing for me on this version due to the GUC not
being initialized, don't we need something like the below as well? (With a
comment explaining why ReadControlFile wasn't enough.)

@@ -5319,6 +5319,7 @@ LocalProcessControlFile(bool reset)
Assert(reset || ControlFile == NULL);
ControlFile = palloc(sizeof(ControlFileData));
ReadControlFile();
+       SetLocalDataChecksumVersion(ControlFile->data_checksum_version);

Yeah, I think this (or something like it) is missing.

Thanks for confirming.

+uint32
+GetLocalDataChecksumVersion(void)
+{
+ return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+ return XLogCtl->data_checksum_version;
+}
I wonder if CachedDataChecksumVersion would be more appropriate to distinguish
it from the Current value, and also to make it appear less like actual copies of
controlfile values such as LocalMinRecoveryPoint.  Another thought is whether we
should have the GetLocalDataChecksumVersion() API at all?
GetCurrentDataChecksumVersion() seems like it would be the better API, no?

FWIW those functions are for debug logging only, I needed to print the
values in a couple places outside xlog.c. I don't intend to make that
part of the patch.

Ah, gotcha, I never applied the debug patch from the patchset so I figured this
was a planned API. The main question still stands though: is LocalDataCheckXX
confusing, and would CachedDataCheckXX be better in order to
distinguish it from actual controlfile copies?

Yeah, I'll think about the naming.

regards

--
Tomas Vondra

#37Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#36)
Re: Changing the state of data checksums in a running cluster

On 3/13/25 17:26, Tomas Vondra wrote:

On 3/13/25 13:32, Daniel Gustafsson wrote:

On 13 Mar 2025, at 12:03, Tomas Vondra <tomas@vondra.me> wrote:

...

This also reminds me I had a question about the barrier - can't it
happen a process gets to process multiple barriers at the same time? I
mean, let's say it gets stuck for a while, and the cluster happens to go
through disable+enable. Won't it then see both barriers? That'd be a
problem, because the core processes the barriers in the order determined
by the enum value, not in the order the barriers happened. Which means
it might break the expected state transitions again (and end with the
wrong local value). I haven't tried, though.

Interesting, that seems like a general deficiency in the barriers, surely
processing them in-order would be more intuitive? That would probably require
some form of Lamport clock though.

Yeah, that seems non-trivial. What if we instead ensured there can't be
two barriers set at the same time? Say, if we (somehow) ensured all
processes saw the previous barrier before allowing a new one, we would
not have this issue, right?

But I don't know what would be a good way to ensure this. Is there a way
to check if all processes saw the barrier? Any ideas?

Actually, scratch this. There already is a way to do this, by using
WaitForProcSignalBarrier. And the XLOG_CHECKSUMS processing already
calls this. So we should not see two barriers at the same time ...
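That is, every state change pairs the emit with a wait before anything else
can happen, e.g. (as in xlog_redo() in the patch):

	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
	WaitForProcSignalBarrier(barrier);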

One issue I ran into is the postmaster does not seem to be processing
the barriers, and thus not getting info about the data_checksum_version
changes.

Makes sense, that seems like a pretty reasonable constraint for the barrier.

Not sure I follow. What's a reasonable constraint?

That the postmaster doesn't process them.

OK, that means we need a way to "refresh" the value for new child
processes, similar to what my patch does. But I suspect there might be
a race condition - if the child process starts while processing the
XLOG_CHECKSUMS record, it might happen to get the new value and then also
the barrier (if it does the "refresh" in between the XLogCtl update and
the barrier). Doesn't this need some sort of interlock preventing this?

The child startup would need to do this:

1) acquire lock
2) reset barriers
3) refresh the LocalDataChecksumValue (from XLogCtl)
4) release lock

while the walreceiver would do this

1) acquire lock
2) update XLogCtl value
3) emit barrier
4) release lock

Or is there a reason why this would be unnecessary?

I still think this might be a problem. I wonder if we could maybe
leverage the barrier generation, to detect that we don't need to process
this barrier, because we already got the value directly ...

FWIW we'd have this problem even if postmaster was processing barriers,
because there'd always be a "gap" between the fork and ProcSignalInit()
registering the new process into the procsignal array.

regards

--
Tomas Vondra

#38Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#37)
Re: Changing the state of data checksums in a running cluster

On 3/14/25 00:11, Tomas Vondra wrote:

...

One issue I ran into is the postmaster does not seem to be processing
the barriers, and thus not getting info about the data_checksum_version
changes.

Makes sense, that seems like a pretty reasonable constraint for the barrier.

Not sure I follow. What's a reasonable constraint?

That the postmaster doesn't process them.

OK, that means we need a way to "refresh" the value for new child
processes, similar to what my patch does. But I suspect there might be
a race condition - if the child process starts while processing the
XLOG_CHECKSUMS record, it might happen to get the new value and then also
the barrier (if it does the "refresh" in between the XLogCtl update and
the barrier). Doesn't this need some sort of interlock preventing this?

The child startup would need to do this:

1) acquire lock
2) reset barriers
3) refresh the LocalDataChecksumValue (from XLogCtl)
4) release lock

while the walreceiver would do this

1) acquire lock
2) update XLogCtl value
3) emit barrier
4) release lock

Or is there a reason why this would be unnecessary?

I still think this might be a problem. I wonder if we could maybe
leverage the barrier generation, to detect that we don't need to process
this barrier, because we already got the value directly ...

FWIW we'd have this problem even if postmaster was processing barriers,
because there'd always be a "gap" between the fork and ProcSignalInit()
registering the new process into the procsignal array.

I experimented with this a little bit, and unfortunately I ran into not
one, but two race conditions in this :-( I don't have reproducers, all
of this was done by manually adding sleep() calls / gdb breakpoints to
pause the processes for a while, but I'll try to explain what/why ...

1) race #1: SetDataChecksumsOn

The function (and all the other "SetDataChecksums" funcs) does this

SpinLockAcquire(&XLogCtl->info_lck);
XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
SpinLockRelease(&XLogCtl->info_lck);

barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);

Now, imagine there's a sleep() before the EmitProcSignalBarrier. A new
process may start during that, and it'll read the current checksum value
from XLogCtl. And then the SetDataChecksumsOn() wakes up, and emits the
barrier. So far so good.

But the new backend is already registered in ProcSignal, so it'll get
the barrier too, and will try to set the local version to "on" again.
And kaboom - that hits the assert in AbsorbChecksumsOnBarrier():

Assert(LocalDataChecksumVersion ==
PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);

The other "SetDataChecksums" have the same issue, except that in those
cases there are no asserts to trip. Only AbsorbChecksumsOnBarrier() has
such assert to check the state transition.

This is "ephemeral" in the sense that setting the value to "on" again
would be harmless, and indeed a non-assert build will run just fine.

2) race #2: InitPostgres

The InitPostgres does this:

InitLocalControldata();

ProcSignalInit(MyCancelKeyValid, MyCancelKey);

where InitLocalControldata gets the current checksum value from XLogCtl,
and ProcSignalInit registers the backend into the procsignal (which is
what barriers are based on).

Imagine there's a sleep() between these two calls, and the cluster does
not have checksums enabled. A backend will start, will read "off" from
XLogCtl, and then gets stuck on the sleep before it gets added to the
procsignal/barrier array.

Now, we enable checksums, and the instance goes through 'inprogress-on'
and 'on' states. This completes, and the backend wakes up and registers
itself into procsignal - but it won't get any barriers, of course.

So we end up with an instance with data_checksums="on", but this one
backend still believes data_checksums="off". This can cause a lot of
trouble, because it won't write blocks with checksums. I.e. this is
persistent data corruption.
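
Just to make the window concrete, here is a toy standalone model of it (plain C
with pthreads; every name in it is invented for illustration, none of this is
PostgreSQL code):

/* toy_race.c - standalone model of the read-then-register gap in race #2.
 * All names are invented for illustration; this is not PostgreSQL code.
 * Build with: cc toy_race.c -o toy_race -lpthread
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int	shared_version = 0;	/* plays the role of XLogCtl->data_checksum_version */
static bool registered = false;	/* plays the role of the procsignal slot */
static int	local_version = -1;	/* the "backend"'s local copy */

static void *
backend_main(void *arg)
{
	(void) arg;

	/* "InitLocalControldata()": read the shared value ... */
	pthread_mutex_lock(&lock);
	local_version = shared_version;
	pthread_mutex_unlock(&lock);

	sleep(1);					/* ... the sleep() discussed above ... */

	/* "ProcSignalInit()": ... and only now become reachable by barriers */
	pthread_mutex_lock(&lock);
	registered = true;
	pthread_mutex_unlock(&lock);
	return NULL;
}

int
main(void)
{
	pthread_t	th;

	pthread_create(&th, NULL, backend_main, NULL);
	usleep(100 * 1000);			/* let the backend read the old value */

	/* "enable checksums": update shared state, notify registered processes */
	pthread_mutex_lock(&lock);
	shared_version = 1;
	if (registered)
		local_version = 1;		/* barrier absorption */
	pthread_mutex_unlock(&lock);

	pthread_join(th, NULL);
	printf("shared=%d local=%d\n", shared_version, local_version);
	/* prints shared=1 local=0: the backend never hears about the change */
	return 0;
}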

I have been thinking about how to fix this. One way would be to
introduce some sort of locking, so that the two steps (update of the
XLogCtl version + barrier emit) and (local flag init + procsignal init)
would always happen atomically. So, something like this:

SpinLockAcquire(&XLogCtl->info_lck);
XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
SpinLockRelease(&XLogCtl->info_lck);

and

SpinLockAcquire(&XLogCtl->info_lck);
InitLocalControldata();
ProcSignalInit(MyCancelKeyValid, MyCancelKey);
SpinLockRelease(&XLogCtl->info_lck);

But that seems pretty heavy-handed, it's definitely much more work while
holding a spinlock than I'm comfortable with, and I wouldn't be
surprised if there were deadlock cases etc. (FWIW I believe it needs to
use XLogCtl->info_lck, to make the value consistent with checkpoints.)

Anyway, I think a much simpler solution would be to reorder InitPostgres
like this:

ProcSignalInit(MyCancelKeyValid, MyCancelKey);

InitLocalControldata();

i.e. to first register into procsignal, and then read the new value.
AFAICS this guarantees we won't lose any checksum version updates. It
does mean we still can get a barrier for a value we've already seen, but
I think we should simply ignore this for the very first update.

Opinions? Other ideas how to fix this?

regards

--
Tomas Vondra

#39Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#38)
Re: Changing the state of data checksums in a running cluster

On 14 Mar 2025, at 13:20, Tomas Vondra <tomas@vondra.me> wrote:

I experimented with this a little bit, and unfortunately I ran into not
one, but two race conditions in this :-( I don't have reproducers, all
of this was done by manually adding sleep() calls / gdb breakpoints to
pause the processes for a while, but I'll try to explain what/why ...

Ugh. Thanks for this!

1) race #1: SetDataChecksumsOn

The function (and all the other "SetDataChecksums" funcs) does this

SpinLockAcquire(&XLogCtl->info_lck);
XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
SpinLockRelease(&XLogCtl->info_lck);

barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);

Now, imagine there's a sleep() before the EmitProcSignalBarrier. A new
process may start during that, and it'll read the current checksum value
from XLogCtl. And then the SetDataChecksumsOn() wakes up, and emits the
barrier. So far so good.

But the new backend is already registered in ProcSignal, so it'll get
the barrier too, and will try to set the local version to "on" again.
And kaboom - that hits the assert in AbsorbChecksumsOnBarrier():

Assert(LocalDataChecksumVersion ==
PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);

The other "SetDataChecksums" have the same issue, except that in those
cases there are no asserts to trip. Only AbsorbChecksumsOnBarrier() has
such assert to check the state transition.

This is "ephemeral" in the sense that setting the value to "on" again
would be harmless, and indeed a non-assert build will run just fine.

As mentioned off-list, being able to loosen the restriction for the first
barrier seen seems like a good way to keep this assertion. Removing it is of
course the alternative solution, as it's not causing any issues, but given how
handy it's been to find actual issues it would be good to be able to keep it.

2) race #2: InitPostgres

The InitPostgres does this:

InitLocalControldata();

ProcSignalInit(MyCancelKeyValid, MyCancelKey);

where InitLocalControldata gets the current checksum value from XLogCtl,
and ProcSignalInit registers the backend into the procsignal (which is
what barriers are based on).

Imagine there's a sleep() between these two calls, and the cluster does
not have checksums enabled. A backend will start, will read "off" from
XLogCtl, and then gets stuck on the sleep before it gets added to the
procsignal/barrier array.

Now, we enable checksums, and the instance goes through 'inprogress-on'
and 'on' states. This completes, and the backend wakes up and registers
itself into procsignal - but it won't get any barriers, of course.

So we end up with an instance with data_checksums="on", but this one
backend still believes data_checksums="off". This can cause a lot of
trouble, because it won't write blocks with checksums. I.e. this is
persistent data corruption.

I have been thinking about how to fix this. One way would be to
introduce some sort of locking, so that the two steps (update of the
XLogCtl version + barrier emit) and (local flag init + procsignal init)
would always happen atomically. So, something like this:

SpinLockAcquire(&XLogCtl->info_lck);
XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
SpinLockRelease(&XLogCtl->info_lck);

and

SpinLockAcquire(&XLogCtl->info_lck);
InitLocalControldata();
ProcSignalInit(MyCancelKeyValid, MyCancelKey);
SpinLockRelease(&XLogCtl->info_lck);

But that seems pretty heavy-handed, it's definitely much more work while
holding a spinlock than I'm comfortable with, and I wouldn't be
surprised if there were deadlock cases etc. (FWIW I believe it needs to
use XLogCtl->info_lck, to make the value consistent with checkpoints.)

Yeah, that seems quite likely to introduce a new set of issues.

Anyway, I think a much simpler solution would be to reorder InitPostgres
like this:

ProcSignalInit(MyCancelKeyValid, MyCancelKey);

InitLocalControldata();

Agreed.

i.e. to first register into procsignal, and then read the new value.
AFAICS this guarantees we won't lose any checksum version updates. It
does mean we still can get a barrier for a value we've already seen, but
I think we should simply ignore this for the very first update.

Calling functions with side effects on state before ProcSignalInit has run
seems like a bad idea; that's a thinko on my part in this patch. Your
solution of reordering seems like the right way to handle this.

--
Daniel Gustafsson

#40Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#39)
6 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 14 Mar 2025, at 14:38, Daniel Gustafsson <daniel@yesql.se> wrote:

On 14 Mar 2025, at 13:20, Tomas Vondra <tomas@vondra.me> wrote:

This is "ephemeral" in the sense that setting the value to "on" again
would be harmless, and indeed a non-assert build will run just fine.

As mentioned off-list, being able to loosen the restriction for the first
barrier seen seems like a good way to keep this assertion. Removing it is of
course the alternative solution, as it's not causing any issues, but given how
handy it's been to find actual issues it would be good to be able to keep it.

i.e. to first register into procsignal, and then read the new value.
AFAICS this guarantees we won't lose any checksum version updates. It
does mean we still can get a barrier for a value we've already seen, but
I think we should simply ignore this for the very first update.

Calling functions with side effects on state before ProcSignalInit has run
seems like a bad idea; that's a thinko on my part in this patch. Your
solution of reordering seems like the right way to handle this.

0006 in the attached version is what I used when testing the above, along
with an update to the copyright year which I had missed earlier. It also
contains the fix in LocalProcessControlFile which I had in my local tree; I
think we need something like that at least.

--
Daniel Gustafsson

Attachments:

v20250314-0006-Reviewfixups.patchapplication/octet-stream; name=v20250314-0006-Reviewfixups.patch; x-unix-mode=0644Download
From 6ff59fbc04596f27cae2af30eda585f1bd8269f5 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 14 Mar 2025 15:00:51 +0100
Subject: [PATCH v20250314 6/6] Reviewfixups

---
 src/backend/access/transam/xlog.c            | 26 +++++++++++++++++++-
 src/backend/postmaster/datachecksumsworker.c |  2 +-
 src/backend/utils/init/postinit.c            |  4 +--
 src/include/postmaster/datachecksumsworker.h |  2 +-
 src/test/checksum/Makefile                   |  2 +-
 5 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 61da6d583cd..e4c72f985e4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -660,6 +660,16 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for enabling
+ * checksums is the first one or not. The first procsignalbarrier can in rare
+ * circumstances cause a transition from 'on' to 'on' when a new process is
+ * spawned between the update of XLogCtl->data_checksum_version and the
+ * barrier being emitted.  This can only happen on the very first barrier so
+ * mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
 /*
  * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
  * See SetLocalDataChecksumVersion().
@@ -4935,7 +4945,20 @@ AbsorbChecksumsOnInProgressBarrier(void)
 bool
 AbsorbChecksumsOnBarrier(void)
 {
-	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	/*
+	 * If the process was spawned between updating XLogCtl and emitting the
+	 * barrier it will have seen the updated value, so for the first barrier
+	 * we accept both "on" and "inprogress-on".
+	 */
+	if (InitialDataChecksumTransition)
+	{
+		Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+			   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION));
+		InitialDataChecksumTransition = false;
+	}
+	else
+		Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
@@ -5319,6 +5342,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index 6a201dca8de..81be2808895 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -150,7 +150,7 @@
  *     online operation).
  *
  *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index bae18b449aa..692570eb0f1 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -746,13 +746,13 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
+
 	/*
 	 * Set up backend local cache of Controldata values.
 	 */
 	InitLocalControldata();
 
-	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
-
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
index 59c9000d646..0649232723d 100644
--- a/src/include/postmaster/datachecksumsworker.h
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -4,7 +4,7 @@
  *	  header file for checksum helper background worker
  *
  *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * src/include/postmaster/datachecksumsworker.h
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
index fd03bf73df4..f287001301e 100644
--- a/src/test/checksum/Makefile
+++ b/src/test/checksum/Makefile
@@ -2,7 +2,7 @@
 #
 # Makefile for src/test/checksum
 #
-# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
 # Portions Copyright (c) 1994, Regents of the University of California
 #
 # src/test/checksum/Makefile
-- 
2.39.3 (Apple Git-146)

v20250314-0005-data_checksum_version-reworks.patchapplication/octet-stream; name=v20250314-0005-data_checksum_version-reworks.patch; x-unix-mode=0644Download
From ec9160192aeb513fda53a3609d37eba78e69756e Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Tue, 11 Mar 2025 19:16:23 +0100
Subject: [PATCH v20250314 5/6] data_checksum_version reworks

---
 src/backend/access/transam/xlog.c       | 133 +++++++++++++++++-------
 src/backend/postmaster/launch_backend.c |  10 ++
 src/include/catalog/pg_control.h        |   5 +-
 src/include/miscadmin.h                 |   4 +
 4 files changed, 116 insertions(+), 36 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f137cdc6d42..61da6d583cd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -550,6 +550,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -4262,6 +4265,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4601,8 +4610,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4734,9 +4741,9 @@ SetDataChecksumsOnInProgress(void)
 
 	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
 
@@ -4780,28 +4787,28 @@ SetDataChecksumsOn(void)
 
 	Assert(ControlFile != NULL);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	SpinLockAcquire(&XLogCtl->info_lck);
 
 	/*
 	 * The only allowed state transition to "on" is from "inprogress-on" since
 	 * that state ensures that all pages will have data checksums written.
 	 */
-	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
 	{
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 		elog(ERROR, "checksums not in \"inprogress-on\" mode");
 	}
 
-	LWLockRelease(ControlFileLock);
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
 	START_CRIT_SECTION();
 
 	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
 
@@ -4836,12 +4843,12 @@ SetDataChecksumsOff(void)
 
 	Assert(ControlFile);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	SpinLockAcquire(&XLogCtl->info_lck);
 
 	/* If data checksums are already disabled there is nothing to do */
-	if (ControlFile->data_checksum_version == 0)
+	if (XLogCtl->data_checksum_version == 0)
 	{
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 		return;
 	}
 
@@ -4852,18 +4859,18 @@ SetDataChecksumsOff(void)
 	 * "inprogress-off" the next transition to "off" can be performed, after
 	 * which all data checksum processing is disabled.
 	 */
-	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
 	{
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 
 		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
 		START_CRIT_SECTION();
 
 		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
-		LWLockRelease(ControlFileLock);
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
 
 		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
 
@@ -4890,7 +4897,7 @@ SetDataChecksumsOff(void)
 		 * or "inprogress-off" and we can transition directly to "off" from
 		 * there.
 		 */
-		LWLockRelease(ControlFileLock);
+		SpinLockRelease(&XLogCtl->info_lck);
 	}
 
 	/*
@@ -4901,9 +4908,9 @@ SetDataChecksumsOff(void)
 
 	XLogChecksums(0);
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	ControlFile->data_checksum_version = 0;
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
 
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
 
@@ -4958,9 +4965,9 @@ AbsorbChecksumsOffBarrier(void)
 void
 InitLocalControldata(void)
 {
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
-	LWLockRelease(ControlFileLock);
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
 }
 
 /*
@@ -4994,6 +5001,40 @@ SetLocalDataChecksumVersion(uint32 data_checksum_version)
 	}
 }
 
+/*
+ * Initialize the various data checksum values - GUC, local, ....
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	uint32	data_checksum_version;
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	data_checksum_version = XLogCtl->data_checksum_version;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	SetLocalDataChecksumVersion(data_checksum_version);
+}
+
+/*
+ * Get the local data_checksum_version (cached XLogCtl value).
+ */
+uint32
+GetLocalDataChecksumVersion(void)
+{
+	return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+	return XLogCtl->data_checksum_version;
+}
+
 /* guc hook */
 const char *
 show_data_checksums(void)
@@ -5447,6 +5488,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6585,7 +6631,7 @@ StartupXLOG(void)
 	 * background worker directly from here, it has to be launched from a
 	 * regular backend.
 	 */
-	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
 		ereport(WARNING,
 				(errmsg("data checksums are being enabled, but no worker is running"),
 				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
@@ -6596,13 +6642,13 @@ StartupXLOG(void)
 	 * checksums and we can move to off instead of prompting the user to
 	 * perform any action.
 	 */
-	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
 	{
 		XLogChecksums(0);
 
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-		ControlFile->data_checksum_version = 0;
-		LWLockRelease(ControlFileLock);
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SpinLockRelease(&XLogCtl->info_lck);
 	}
 
 	/*
@@ -7460,6 +7506,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from xlogctl, valid at
+	 * the time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7715,6 +7767,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7861,6 +7916,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -8203,6 +8262,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -9065,9 +9128,9 @@ xlog_redo(XLogReaderState *record)
 
 		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
 
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-		ControlFile->data_checksum_version = state.new_checksumtype;
-		LWLockRelease(ControlFileLock);
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
 
 		/*
 		 * Block on a procsignalbarrier to await all processes having seen the
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index b06b5fb45dd..65bbf770d28 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -285,6 +285,16 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 			memcpy(MyClientSocket, client_sock, sizeof(ClientSocket));
 		}
 
+		/*
+		 * update the LocalProcessControlFile to match XLogCtl->data_checksum_version
+		 *
+		 * XXX It seems the postmaster (which is what gets forked into the new
+		 * child process) does not absorb the checksum barriers, therefore it
+		 * does not update the value (except after a restart). Not sure if there
+		 * is some sort of race condition.
+		 */
+		InitLocalDataChecksumVersion();
+
 		/*
 		 * Run the appropriate Main function
 		 */
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 9b5a50adf54..727042b73be 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksums at the time of the checkpoint  */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -220,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;		/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 677d0b45bd4..133e7fde290 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -540,6 +540,10 @@ extern Size EstimateClientConnectionInfoSpace(void);
 extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
 extern void RestoreClientConnectionInfo(char *conninfo);
 
+extern uint32 GetLocalDataChecksumVersion(void);
+extern uint32 GetCurrentDataChecksumVersion(void);
+extern void InitLocalDataChecksumVersion(void);
+
 /* in executor/nodeHash.c */
 extern size_t get_hash_memory_limit(void);
 
-- 
2.39.3 (Apple Git-146)

v20250314-0004-perltidy.patchapplication/octet-stream; name=v20250314-0004-perltidy.patch; x-unix-mode=0644Download
From 9ed00280ee490fdc95cf25a7993521616aeae368 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 15:05:43 +0100
Subject: [PATCH v20250314 4/6] perltidy

---
 src/test/perl/PostgreSQL/Test/Cluster.pm | 6 ++++--
 src/test/subscription/t/013_partition.pl | 3 +--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 666bd2a2d4c..1c66360c16c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3761,7 +3761,8 @@ sub checksum_enable_offline
 	my ($self) = @_;
 
 	print "### Enabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
 	return;
 }
 
@@ -3778,7 +3779,8 @@ sub checksum_disable_offline
 	my ($self) = @_;
 
 	print "### Disabling checksums in \"$self->data_dir\"\n";
-	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
 	return;
 }
 
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 61b0cb4aa1a..4f78dd48815 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -51,8 +51,7 @@ $node_subscriber1->safe_psql('postgres',
 );
 # make a BRIN index to test aminsertcleanup logic in subscriber
 $node_subscriber1->safe_psql('postgres',
-	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)"
-);
+	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)"
 );
-- 
2.39.3 (Apple Git-146)

v20250314-0003-pgindent.patchapplication/octet-stream; name=v20250314-0003-pgindent.patch; x-unix-mode=0644Download
From d75280c7dcae980b06450dba302dbac7f35f24e5 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 14:57:05 +0100
Subject: [PATCH v20250314 3/6] pgindent

---
 src/backend/postmaster/datachecksumsworker.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index d9833cd4c98..6a201dca8de 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -921,7 +921,7 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
 		};
 
-		int64	vals[6];
+		int64		vals[6];
 
 		vals[0] = list_length(DatabaseList);
 		vals[1] = 0;
@@ -974,7 +974,8 @@ ProcessAllDatabases(bool immediate_checkpoint)
 			processed_databases++;
 
 			/*
-			 * Update the number of processed databases in the progress report.
+			 * Update the number of processed databases in the progress
+			 * report.
 			 */
 			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
 										 processed_databases);
@@ -1126,9 +1127,9 @@ DataChecksumsWorkerShmemInit(void)
 		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
 
 		/*
-		 * Even if this is a redundant assignment, we want to be explicit about
-		 * our intent for readability, since we want to be able to query this
-		 * state in case of restartability.
+		 * Even if this is a redundant assignment, we want to be explicit
+		 * about our intent for readability, since we want to be able to query
+		 * this state in case of restartability.
 		 */
 		DataChecksumsWorkerShmem->launch_enable_checksums = false;
 		DataChecksumsWorkerShmem->launcher_running = false;
@@ -1339,7 +1340,7 @@ DataChecksumsWorkerMain(Datum arg)
 			PROGRESS_DATACHECKSUMS_RELS_DONE
 		};
 
-		int64	vals[2];
+		int64		vals[2];
 
 		vals[0] = list_length(RelationList);
 		vals[1] = 0;
-- 
2.39.3 (Apple Git-146)

v20250314-0002-review-fixes.patchapplication/octet-stream; name=v20250314-0002-review-fixes.patch; x-unix-mode=0644Download
From 2fd9b67b61fc1ded059500f468f0fa1824102a06 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 10 Mar 2025 14:56:14 +0100
Subject: [PATCH v20250314 2/6] review fixes

---
 doc/src/sgml/monitoring.sgml                 | 13 +++----------
 src/backend/catalog/system_views.sql         |  7 +++----
 src/backend/postmaster/datachecksumsworker.c | 11 +----------
 src/include/commands/progress.h              |  5 ++---
 src/include/miscadmin.h                      |  4 +++-
 src/test/regress/expected/rules.out          |  7 +++----
 6 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e51bf902dc2..8fec0277f97 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6911,7 +6911,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
        <para>
         The total number of databases which will be processed. Only the
         launcher worker has this value set, the other worker processes
-        have this <literal>NULL</literal>.
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -6924,7 +6924,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
        <para>
         The number of databases which have been processed. Only the
         launcher worker has this value set, the other worker processes
-        have this <literal>NULL</literal>.
+        have this set to <literal>NULL</literal>.
        </para>
       </entry>
      </row>
@@ -7009,13 +7009,6 @@ FROM pg_stat_get_backend_idset() AS backendid;
        The command is currently disabling data checksums on the cluster.
       </entry>
      </row>
-     <row>
-      <entry><literal>waiting on backends</literal></entry>
-      <entry>
-       The command is currently waiting for backends to acknowledge the data
-       checksum operation.
-      </entry>
-     </row>
      <row>
       <entry><literal>waiting on temporary tables</literal></entry>
       <entry>
@@ -7027,7 +7020,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
       <entry><literal>waiting on checkpoint</literal></entry>
       <entry>
        The command is currently waiting for a checkpoint to update the checksum
-       state at the end.
+       state before finishing.
       </entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 4330d0ad656..6ffd31ce39c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1340,10 +1340,9 @@ CREATE VIEW pg_stat_progress_data_checksums AS
         CASE S.param1 WHEN 0 THEN 'enabling'
                       WHEN 1 THEN 'disabling'
                       WHEN 2 THEN 'waiting'
-                      WHEN 3 THEN 'waiting on backends'
-                      WHEN 4 THEN 'waiting on temporary tables'
-                      WHEN 5 THEN 'waiting on checkpoint'
-                      WHEN 6 THEN 'done'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
                       END AS phase,
         CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
         S.param3 AS databases_done,
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index bbbce61cfa6..d9833cd4c98 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -801,15 +801,6 @@ again:
 		}
 		RESUME_INTERRUPTS();
 
-		/*
-		 * Initialize progress and indicate that we are waiting on the other
-		 * backends to clear the procsignalbarrier.
-		 */
-		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
-									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
-
-		/* XXX isn't it weird there's no wait between the phase updates? */
-
 		/*
 		 * Set the state to inprogress-on and wait on the procsignal barrier.
 		 */
@@ -1083,7 +1074,7 @@ ProcessAllDatabases(bool immediate_checkpoint)
 
 	/*
 	 * When enabling checksums, we have to wait for a checkpoint for the
-	 * checksums to e.
+	 * checksums to change from in-progress to on.
 	 */
 	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
 								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 94b478a6cc9..b172a5f24ce 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -170,8 +170,7 @@
 #define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
 #define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
 #define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
-#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
 
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index c238c31604c..677d0b45bd4 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -392,7 +392,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
-#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2cfea837554..0dd383a76dd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2048,10 +2048,9 @@ pg_stat_progress_data_checksums| SELECT s.pid,
             WHEN 0 THEN 'enabling'::text
             WHEN 1 THEN 'disabling'::text
             WHEN 2 THEN 'waiting'::text
-            WHEN 3 THEN 'waiting on backends'::text
-            WHEN 4 THEN 'waiting on temporary tables'::text
-            WHEN 5 THEN 'waiting on checkpoint'::text
-            WHEN 6 THEN 'done'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
             ELSE NULL::text
         END AS phase,
         CASE s.param2
-- 
2.39.3 (Apple Git-146)

v20250314-0001-Online-enabling-and-disabling-of-data-chec.patchapplication/octet-stream; name=v20250314-0001-Online-enabling-and-disabling-of-data-chec.patch; x-unix-mode=0644Download
From cab1b5b23489df3bc4b2604158c3f600ec56d1a1 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250314 1/6] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  215 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  489 +++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   21 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1447 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |    3 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   31 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    1 +
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   17 +
 src/include/miscadmin.h                       |    4 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 +
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  139 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   34 +
 src/test/regress/expected/rules.out           |   37 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 58 files changed, 3194 insertions(+), 50 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1c3810e1a04..0df5067c2c6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29882,6 +29882,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts <glossterm linkend="glossary-data-checksums-worker"> processes</glossterm>
+     for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index aaa6586d3a4..e51bf902dc2 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6828,6 +6829,212 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a datachecksumworker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher process
+       relation
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on backends</literal></entry>
+      <entry>
+       The command is currently waiting for backends to acknowledge the data
+       checksum operation.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state at the end.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
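+
+  <para>
+   For example, the progress of an ongoing data checksum operation could be
+   monitored with a query along these lines (the launcher row is the one with
+   <structfield>datid</structfield> 0):
+<programlisting>
+SELECT pid, datname, phase, relations_done, relations_total
+  FROM pg_stat_progress_data_checksums;
+</programlisting>
+  </para>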
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   If checksums were in the process of being enabled online when the cluster
+   was shut down, <application>pg_checksums</application> will still process
+   all relations when enabling checksums, regardless of any progress made by
+   the online operation.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time, either as an offline
+   operation or online in a running cluster, allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
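+
+   <para>
+    For example, the process could be started or stopped from a superuser
+    session with calls along these lines:
+<programlisting>
+SELECT pg_enable_data_checksums();
+SELECT pg_disable_data_checksums();
+</programlisting>
+   </para>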
+
+   <para>
+    Enabling checksums will put the cluster into
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
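+
+   <para>
+    The current state can be inspected at any time, for example with:
+<programlisting>
+SHOW data_checksums;
+</programlisting>
+    which reports <literal>inprogress-on</literal> while the worker is
+    processing existing data and <literal>on</literal> once the operation has
+    completed.
+   </para>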
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to be removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
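+
+   <para>
+    As an illustration, the following call enables checksums with throttling
+    (the values shown are arbitrary and should be tuned to the system):
+<programlisting>
+SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
+</programlisting>
+   </para>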
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..f137cdc6d42 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -647,6 +647,24 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +733,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +848,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or data checksums need to be written (all of which
+		 * force full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +864,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4579,9 +4602,7 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -4615,13 +4636,376 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function must be called as close to the write operation
+ * as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  This function must be
+ * called as close to the validation as possible to keep the critical section
+ * short, in order to protect against time-of-check/time-of-use situations
+ * around data checksum validation.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedVerify(void)
 {
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	LWLockRelease(ControlFileLock);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await the state transition to "on" in all backends. When done we know
+	 * that data checksums are enabled in all backends and are both written
+	 * and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (ControlFile->data_checksum_version == 0)
+	{
+		LWLockRelease(ControlFileLock);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		LWLockRelease(ControlFileLock);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		LWLockRelease(ControlFileLock);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		LWLockRelease(ControlFileLock);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint while disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	ControlFile->data_checksum_version = 0;
+	LWLockRelease(ControlFileLock);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	SetLocalDataChecksumVersion(0);
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
+	LWLockRelease(ControlFileLock);
+}
+
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+static void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -6194,6 +6578,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here, it has to be launched from a
+	 * regular backend.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = 0;
+		LWLockRelease(ControlFileLock);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -8201,6 +8612,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9058,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		ControlFile->data_checksum_version = state.new_checksumtype;
+		LWLockRelease(ControlFileLock);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..4330d0ad656 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,27 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on backends'
+                      WHEN 4 THEN 'waiting on temporary tables'
+                      WHEN 5 THEN 'waiting on checkpoint'
+                      WHEN 6 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..bbbce61cfa6
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1447 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums at initdb time, or on a shut-down cluster with
+ * pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This
+ * ensures that backends which have yet to move from the "on" state can still
+ * validate data checksums correctly.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: Iff the checksum
+ *     on the page happens to already match we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the master happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. Iff wal_level is set to "minimal",
+		 * off, and then turned on again. If wal_level is set to "minimal",
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort request will bubble up from here. It's safe to check this
+		 * without a lock, because if we miss it being set, we will try again
+		 * soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't at this point have changed the pg_control state to enabled, so
+	 * when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will be set to off before this
+	 * so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher-running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between blocks in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks waiting for all current transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, the process
+ * exits with FATAL since processing cannot continue.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, we need to check if processing was
+	 * previously interrupted such that we should resume rather than start
+	 * from scratch.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Initialize progress and indicate that we are waiting on the other
+		 * backends to clear the procsignalbarrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS);
+
+		/* XXX isn't it weird there's no wait between the phase updates? */
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process when enabling
+ * checksums, looping around computing a new list and comparing it to the
+ * already seen ones until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first run processes shared catalogs, and later runs
+	 * don't repeat that work in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not change during processing; the counter
+	 * for processed databases is instead increased so that it can be
+	 * compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64	vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting with a new database list" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResult *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && *entry == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksummed pages to be flushed out to disk.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if this is a redundant assignment, we want to be explicit about
+		 * our intent for readability, since we want to be able to query this
+		 * state in case of restartability.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list; include_shared has no relevance when building a list
+ * of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64	vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 77fb877dbad..b06b5fb45dd 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -202,6 +202,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d13846298bd..81eb84e7efb 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2946,6 +2946,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 78f9a0a11c4..fe1922868b9 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
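+
+For illustration, the calls look as follows (the cost_delay and cost_limit
+values of 0 and 100 below are the defaults used by the regression tests in
+src/test/checksum; fast is kept off for normal use):
+
+    SELECT pg_enable_data_checksums(0, 100, false);
+    SELECT pg_disable_data_checksums();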
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a8cb54a7732..7c9ff5c19c5 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -376,6 +376,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 3c594415bfd..b1b5cdcf36c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 662ce46cbc2..2cb7766c1e2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = "datachecksumsworker launcher";
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = "datachecksumsworker worker";
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 4b2faf1ba9d..bae18b449aa 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -746,6 +746,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -876,7 +881,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 508970680d1..2f5573a8184 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
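+
+/*
+ * As an illustration, "SHOW data_checksums" (or the pg_settings view, as
+ * exercised by the tests in src/test/checksum) reports one of the values
+ * listed above: "on", "off", "inprogress-on" or "inprogress-off".
+ */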
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1943,17 +1950,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5302,6 +5298,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are in an in-progress state in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +231,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..9b5a50adf54 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -80,6 +80,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 890822eaf79..6987bfb3ab9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12256,6 +12256,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..94b478a6cc9 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,21 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BACKENDS	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 5
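+
+/*
+ * These parameters and phases are reported by the launcher and workers; a
+ * sketch of monitoring a run from SQL, assuming the view name mentioned in
+ * DataChecksumsWorkerMain:
+ *
+ *     SELECT * FROM pg_stat_progress_data_checksums;
+ */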
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 6f16794eb63..c238c31604c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess()	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 114eb1f8f76..61a3e3af9d4 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * can consume two more slots while checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: "make check" creates a temporary installation with multiple nodes,
+primary and standby(s), for the purpose of the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
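+
+A single test file can be selected with PROVE_TESTS, for example:
+
+    make check PROVE_TESTS='t/001_basic.pl'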
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast.  For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints.  0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so
+# let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksum processing to block on, in this case a
+# pre-existing temporary table which is kept around while processing starts.
+# We accomplish this with an interactive psql session which keeps its
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6782664f4e6
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,139 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so
+# let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which will cause immediate checkpoints. 0 and 100 are the
+# defaults for cost_delay and cost_limit, which are fine for testing, so
+# let's keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksum processing to block on, in this case a
+# pre-existing temporary table which is kept around while processing starts.
+# We accomplish this with an interactive psql session which keeps its
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyway and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..666bd2a2d4c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,40 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D', $self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..2cfea837554 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,43 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on backends'::text
+            WHEN 4 THEN 'waiting on temporary tables'::text
+            WHEN 5 THEN 'waiting on checkpoint'::text
+            WHEN 6 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index f77caacc17d..85e15d4cc2b 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -87,7 +103,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(71 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 93339ef3c58..3ed3f7d4216 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -594,6 +595,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4146,6 +4151,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#41Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#40)
5 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 3/14/25 15:06, Daniel Gustafsson wrote:

On 14 Mar 2025, at 14:38, Daniel Gustafsson <daniel@yesql.se> wrote:

On 14 Mar 2025, at 13:20, Tomas Vondra <tomas@vondra.me> wrote:

This is "ephemeral" in the sense that setting the value to "on" again
would be harmless, and indeed a non-assert build will run just fine.

As mentioned off-list, being able to loosen the restriction for the first
barrier seen seems like a good way to keep this assertion. Removing it is of
course the alternative solution, as it's not causing any issues, but given how
handy it's been to find actual issues it would be good to be able to keep it.

i.e. to first register into procsignal, and then read the new value.
AFAICS this guarantees we won't lose any checksum version updates. It
does mean we still can get a barrier for a value we've already seen, but
I think we should simply ignore this for the very first update.

Calling functions with side effects when setting state seems like a bad idea
before ProcSignalInit has run; that's a thinko on my part in this patch. Your
solution of reordering seems like the right way to handle this.

0006 in the attached version is what I have used when testing the above, along
with an update to the copyright year which I had missed doing earlier. It also
contains the fix in LocalProcessControlFile which I had in my local tree, I
think we need something like that at least.

Thanks, here's an updated patch version - I squashed all the earlier
parts, but I kept your changes and my adjustments separate, for clarity.
A couple comments:

1) I don't think the comment before InitialDataChecksumTransition was
entirely accurate, because it said we can see the duplicate state only
for the "on" state. But AFAICS we can see duplicate values for any state,
except that we only have an assert for "on", so we don't notice the
other cases. I wonder if we could strengthen this a bit, by adding some
asserts for the other states too.

2) I admit it's rather subjective, but I didn't like how you did the
assert in AbsorbChecksumsOnBarrier. But looking at it now in the diff,
maybe it was more readable ...

3) I renamed InitLocalControldata() to InitLocalDataChecksumVersion().
The name was entirely misleading, because it now initializes the flag in
XLogCtl, it has nothing to do with control file.

4) I realized AuxiliaryProcessMainCommon() may be a better place to
initialize the checksum flag for non-backend processes. In fact, doing
it in postmaster_child_launch() had the same race condition because it
happened before ProcSignalInit().
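
To make 1) and 4) a bit more concrete, here's roughly what I have in mind --
a simplified sketch only, not the actual patch code (the
"absorbed_first_barrier" flag is made up, and I'm assuming the usual
bool-returning convention for barrier absorb functions):

static bool absorbed_first_barrier = false;

bool
AbsorbChecksumsOnBarrier(void)
{
	/*
	 * The very first barrier after startup may carry a value we already
	 * read when initializing LocalDataChecksumVersion right after
	 * ProcSignalInit(), so only assert the expected transition for
	 * barriers after the first one.
	 */
	Assert(!absorbed_first_barrier ||
		   LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);

	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
	absorbed_first_barrier = true;
	return true;
}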

I'm sure there's cleanup possible in a bunch of places, but the really
bad thing is I realized the handling on a standby is not quite correct.
I don't know what exactly is happening, there's too many moving bits,
but here's what I see ...

Every now and then, after restarting the standby, it logs a bunch of
page verification failures. Stuff like this:

WARNING: page verification failed, calculated checksum 9856 but
expected 0
CONTEXT: WAL redo at 0/3447BA8 for Heap2/VISIBLE:
snapshotConflictHorizon: 0, flags: 0x03; blkref #0: rel
1663/16384/16401, fork 2, blk 0 FPW; blkref #1: rel
1663/16384/16401, blk 0
WARNING: page verification failed, LSN 0/CF54C10

This is after an immediate shutdown, but I've seen similar failures for
fast shutdowns too (the root causes may be different / may need a
different fix, not sure).

The instance restarts, and the "startup" process starts recovery

LOG: redo starts at 0/2000028

This matches the LSN from the very first start of the standby - there were
no restart points since then, apparently. And since then the primary did
this with the checksums (per pg_waldump):

lsn: 0/0ECCFC48, prev 0/0ECCFBA0, desc: CHECKSUMS inprogress-off
lsn: 0/0ECD0168, prev 0/0ECD0128, desc: CHECKSUMS off

The instance already saw both records before the immediate shutdown (per
the logging in patch 0004), but after the restart the instance goes back
to having checksums enabled again

data_checksum_version = 1

Which is correct, because it starts at 0/2000028, which is before either
of the XLOG_CHECKSUMS records. But then at 0/3447BA8 (which is *before*
either of the checksum changes) it tries to read a page from disk, and
hits a checksum error. That page is from the future (per the page LSN
logged by patch 0004), but it's still before both XLOG_CHECKSUMS
messages. So how come the page has pd_checksum 0?

I'd have understood if the page came "broken" from the primary, but I've
not seen a single page verification failure on that side (and it's
subject to the same fast/immediate restarts, etc).

I wonder if this might be related to how we enforce checkpoints only
when setting the checksums to "on" on the primary. Maybe that's safe on
primary but not on a standby?

FWIW I've seen similar issues for "fast" shutdowns too - at least the
symptoms are similar, but the mechanism might be a bit different. In
particular, I suspect there's some sort of thinko in updating the
data_checksum_version in the control file, but I can't put my finger on
it yet.

Another thing I noticed is this comment in CreateRestartPoint(), before
one of the early exits:

/*
* If the last checkpoint record we've replayed is already our last
* restartpoint, we can't perform a new restart point. We still update
* minRecoveryPoint in that case, so that if this is a shutdown restart
* point, we won't start up earlier than before. That's not strictly
* necessary, but when hot standby is enabled, it would be rather weird
* if the database opened up for read-only connections at a
* point-in-time before the last shutdown. Such time travel is still
* possible in case of immediate shutdown, though.
* ...

I wonder if this "time travel backwards" might be an issue for this too,
because it might mean we end up picking the wrong data_checksum_version
from the control file. In any case, if this happens, we don't get to the
ControlFile->data_checksum_version update a bit further down. And
there's another condition that can skip that.
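
To spell out the scenario I'm worried about, roughly (a heavily simplified
sketch, not the actual source; the boolean parameter stands in for the
early-exit condition the comment above refers to):

bool
CreateRestartPoint_sketch(bool already_have_restartpoint)
{
	if (already_have_restartpoint)
	{
		/* the early exit the quoted comment talks about */
		UpdateMinRecoveryPoint(InvalidXLogRecPtr, true);
		return false;	/* ControlFile->data_checksum_version not refreshed */
	}

	/* ... much further down, only reached on the normal path ... */
	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
	UpdateControlFile();
	return true;
}

If we only ever hit the early exit before the immediate shutdown, the on-disk
value could stay at data_checksum_version = 1 even though the standby already
replayed the records switching it off, which would match what I'm seeing.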

I'll continue investigating this next week, but at this point I'm quite
confused and would be grateful for any insights ...

regards

--
Tomas Vondra

Attachments:

v20250315-0001-Online-enabling-and-disabling-of-data-chec.patch (text/x-patch)
From f72f5aa466fb308ee5816b72968677d85537cb57 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 7 Mar 2025 19:21:22 +0100
Subject: [PATCH v20250315 1/4] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could previously only be enabled during initdb or
when the cluster is offline using the pg_checksums application. This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  554 ++++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1439 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |   13 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   31 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   16 +
 src/include/miscadmin.h                       |   10 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    5 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 +
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  139 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   36 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/test/subscription/t/013_partition.pl      |    3 +-
 src/tools/pgindent/typedefs.list              |    6 +
 59 files changed, 3263 insertions(+), 54 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1c3810e1a04..0df5067c2c6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -29882,6 +29882,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates the process of enabling data checksums for the cluster. This
+        will switch the data checksums mode to <literal>inprogress-on</literal>
+        as well as start a background worker that will process all pages in the
+        cluster and enable checksums on them. When all data pages have had
+        checksums enabled, the cluster will automatically switch the data
+        checksums mode to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c0f812e3f5e..547e8586d4b 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -159,6 +159,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -548,6 +550,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index aaa6586d3a4..8fec0277f97 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3497,8 +3497,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3508,8 +3509,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6828,6 +6829,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled or disabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently processing the data pages of one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a datachecksumsworker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher
+       process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   online when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any progress already made.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time, either as an offline
+   operation or online in a running cluster while allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster's data checksum mode into
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts processing, so that it can be certain there are no tables created
+    by a transaction which has not committed yet and which would therefore not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to be removed before it finishes.
+    If long-lived temporary tables are used in the application, it may be necessary
+    to terminate those connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index 58040f28656..1e3ad2ab68d 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 799fc739e18..61da6d583cd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -550,6 +550,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -647,6 +650,24 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +736,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +851,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +867,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4239,6 +4265,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4578,10 +4610,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
 }
 
 /*
@@ -4615,13 +4643,410 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function should be called as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  This function should be
+ * called as close to the validation call as possible to keep the critical
+ * section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description of how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	SetLocalDataChecksumVersion(0);
+	return true;
+}
+
+/*
+ * InitLocalControldata
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+/*
+ * SetLocalDataChecksumVersion
+ *		Update the backend-local checksum version and the data_checksums GUC
+ *
+ * XXX This should probably be called from all places that modify the value of
+ * LocalDataChecksumVersion, to make sure the data_checksums GUC stays in sync.
+ *
+ * XXX Aren't the PG_DATA_* and DATA_* constants the same thing? Why do we need
+ * both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
+/*
+ * Initialize the backend-local data checksum state (LocalDataChecksumVersion
+ * and the data_checksums GUC) from the current value in shared memory.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	uint32	data_checksum_version;
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	data_checksum_version = XLogCtl->data_checksum_version;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	SetLocalDataChecksumVersion(data_checksum_version);
+}
+
+/*
+ * Get the local data_checksum_version (cached XLogCtl value).
+ */
+uint32
+GetLocalDataChecksumVersion(void)
+{
+	return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+	return XLogCtl->data_checksum_version;
+}
+
+/* Show hook for the data_checksums GUC */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -5063,6 +5488,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6194,6 +6624,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums still being enabled (the
+	 * "inprogress-on" state), we notify the user that they need to manually
+	 * restart checksum processing. This is because we cannot launch a dynamic
+	 * background worker directly from here; it has to be launched from a
+	 * regular backend.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that all backends have stopped validating checksums, so we can
+	 * move directly to "off" without prompting the user to take any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -7049,6 +7506,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from xlogctl, valid at
+	 * the time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7304,6 +7767,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7450,6 +7916,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7792,6 +8262,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* use the data checksum version from the last checkpoint */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8201,6 +8675,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8629,6 +9121,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-like
+ * cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
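+ *
+ * Illustrative SQL-level usage via the wrapper functions defined in
+ * system_functions.sql (a sketch; the parameter values are arbitrary and the
+ * functions are superuser-only):
+ *
+ *		SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
+ *		SELECT phase, datname FROM pg_stat_progress_data_checksums;
+ *		SHOW data_checksums;	-- "inprogress-on" until processing finishes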
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 3f8a3c55725..598dd41bae6 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2006,6 +2007,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM PUBLIC;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM PUBLIC;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a4d2cfdcaf5..6ffd31ce39c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1334,6 +1334,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..6a201dca8de
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1439 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or on a shut down cluster
+ * with pg_checksums, no extra process is required as each page is checksummed
+ * and verified when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are made to the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the respective end states of data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to process data checksum validation.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart, since a dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the checksum
+ *     on the page happens to already match, we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip already-checksummed pages when it is used
+ *     to enable checksums on a cluster which is in the inprogress-on state
+ *     and may have checksummed pages (make pg_checksums able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and considering
+ * it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Launch the datachecksumsworker launcher process
+ *
+ * The entrypoint for starting data checksum processing, for enabling as
+ * well as disabling checksums.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the latest
+	 * when it's about to exit, and will loop back to process the new request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %dblocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. If wal_level is set to "minimal",
+		 * this could be avoided when the checksum is calculated to be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check whether we have been asked to
+		 * abort; the abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster died we cannot end up with a processed database, so
+	 * we have no alternative other than exiting. When enabling checksums we
+	 * won't have changed the pg_control version to enabled at this point, so
+	 * when the cluster comes back up, processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active at the time of the call
+ * have ended. If the postmaster dies while waiting, a FATAL error is raised
+ * since checksum processing cannot be completed without it.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, transition the cluster to the
+	 * "inprogress-on" state, write checksums in every database, and finally
+	 * switch the state to "on".
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums, and will keep looping, computing a new list and comparing it to
+ * the databases already seen, until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first run processes the shared catalogs, but they
+	 * are not processed again for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number is not changed during processing; the counter for
+	 * processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
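+			/*
+			 * Record the result for this database; on a repeat visit the
+			 * existing entry is reused and its retry counter is bumped.
+			 */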
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to look for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * no database is left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. A failure to enable checksums for a database can mean that
+	 * processing actually failed for some reason, or that the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a failed
+	 * database still exists, we abort the entire checksumming process and
+	 * exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is intended for tests, which would otherwise not complete
+	 * reliably within a timeout.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even though these assignments are redundant after the MemSet above,
+		 * we want to be explicit about the initial state for readability,
+		 * since this state may later be queried to support restartability.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
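+	/*
+	 * The catalog scan must run inside a transaction, but the list entries
+	 * are allocated in the caller's memory context so that they survive the
+	 * commit.
+	 */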
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested type of relations. If
+ * temp_relations is true then only temporary relations are returned. If
+ * temp_relations is false then non-temporary relations with storage (which
+ * is what carries data checksums) are returned. If include_shared is true
+ * then shared relations are included as well in a non-temporary list;
+ * include_shared has no relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
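+	/*
+	 * As in BuildDatabaseList, scan the catalog in a transaction and keep
+	 * the resulting list in the caller's memory context.
+	 */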
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
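+	/*
+	 * Remember which direction this worker was launched for, so that a later
+	 * change of the launch flag in shared memory can be detected as a request
+	 * to abort.
+	 */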
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
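+	/*
+	 * Connect to the target database, bypassing datallowconn since every
+	 * database must be processed regardless of its connection settings.
+	 */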
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we can finish, since we
+	 * cannot access and modify these relations ourselves.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
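+	/*
+	 * Build the list of relations to process, including the shared catalogs
+	 * if this worker is the first one to handle them.
+	 */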
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
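+		/*
+		 * Check whether the launcher has since been asked to switch
+		 * direction (i.e. checksums are being disabled); if so, abort.
+		 */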
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
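+	/* Report the result back to the launcher via shared memory */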
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 77fb877dbad..65bbf770d28 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -202,6 +202,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
@@ -282,6 +285,16 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 			memcpy(MyClientSocket, client_sock, sizeof(ClientSocket));
 		}
 
+		/*
+		 * update the LocalProcessControlFile to match XLogCtl->data_checksum_version
+		 *
+		 * XXX It seems the postmaster (which is what gets forked into the new
+		 * child process) does not absorb the checksum barriers, therefore it
+		 * does not update the value (except after a restart). Not sure if there
+		 * is some sort of race condition.
+		 */
+		InitLocalDataChecksumVersion();
+
 		/*
 		 * Run the appropriate Main function
 		 */
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d13846298bd..81eb84e7efb 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2946,6 +2946,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 78f9a0a11c4..fe1922868b9 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 174eed70367..e7761a3ddb6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -148,6 +150,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, WaitEventCustomShmemSize());
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -330,6 +333,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7d201965503..2b13a8cd260 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -575,6 +576,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index ecc81aacfc3..8bd5fed8c85 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -98,7 +98,7 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1501,7 +1501,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1531,7 +1531,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a8cb54a7732..7c9ff5c19c5 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -376,6 +376,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_BG_WRITER:
 		case B_CHECKPOINTER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index eb575025596..e7b418a000e 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 3c594415bfd..b1b5cdcf36c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -115,6 +115,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksum enabling to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -346,6 +348,7 @@ WALSummarizer	"Waiting to read or update WAL summarization state."
 DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 662ce46cbc2..2cb7766c1e2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index dc3521457c7..a071ba6f455 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,6 +293,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -892,7 +898,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 4b2faf1ba9d..bae18b449aa 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -746,6 +746,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
@@ -876,7 +881,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9c0b10ad4dc..5c00e6a5bf6 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -476,6 +476,14 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -601,7 +609,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1952,17 +1959,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5311,6 +5307,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 867aeddc601..79c3f86357e 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..f48936d3b00 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -14,6 +14,7 @@
 
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -735,6 +736,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +231,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..727042b73be 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;		/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 890822eaf79..6987bfb3ab9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12256,6 +12256,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..b172a5f24ce 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,20 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 6f16794eb63..133e7fde290 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -365,6 +365,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -389,6 +392,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalReceiverProcess()		(MyBackendType == B_WAL_RECEIVER)
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
@@ -534,6 +540,10 @@ extern Size EstimateClientConnectionInfoSpace(void);
 extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
 extern void RestoreClientConnectionInfo(char *conninfo);
 
+extern uint32 GetLocalDataChecksumVersion(void);
+extern uint32 GetCurrentDataChecksumVersion(void);
+extern void InitLocalDataChecksumVersion(void);
+
 /* in executor/nodeHash.c */
 extern size_t get_hash_memory_limit(void);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for the data checksums background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 6646b6f6371..5326782171f 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 is used when data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index cf565452382..afe39db753c 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -83,3 +83,4 @@ PG_LWLOCK(49, WALSummarizer)
 PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
+PG_LWLOCK(53, DataChecksumsWorker)
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 0750ec3c474..17fb0decaff 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -447,9 +447,10 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The data checksums launcher and worker
+ * can consume 2 more slots while checksums are being enabled or disabled.
  */
-#define NUM_AUXILIARY_PROCS		6
+#define NUM_AUXILIARY_PROCS		8
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 022fd8ed933..8937fa6ed3d 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: "make check" creates a temporary installation with multiple
+nodes, primary as well as standby(s), for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value of 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value of 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that enabling does not complete while it exists.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6782664f4e6
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,139 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value of 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value of 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit and are fine to use for testing, so let's
+# keep them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b105cba05a6..1c66360c16c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3748,6 +3748,42 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 62f69ac20b2..0dd383a76dd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2041,6 +2041,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index f77caacc17d..85e15d4cc2b 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -87,7 +103,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(71 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/test/subscription/t/013_partition.pl b/src/test/subscription/t/013_partition.pl
index 61b0cb4aa1a..4f78dd48815 100644
--- a/src/test/subscription/t/013_partition.pl
+++ b/src/test/subscription/t/013_partition.pl
@@ -51,8 +51,7 @@ $node_subscriber1->safe_psql('postgres',
 );
 # make a BRIN index to test aminsertcleanup logic in subscriber
 $node_subscriber1->safe_psql('postgres',
-	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)"
-);
+	"CREATE INDEX tab1_c_brin_idx ON tab1 USING brin (c)");
 $node_subscriber1->safe_psql('postgres',
 	"CREATE TABLE tab1_1 (b text, c text DEFAULT 'sub1_tab1', a int NOT NULL)"
 );
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 93339ef3c58..3ed3f7d4216 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -403,6 +403,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -594,6 +595,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4146,6 +4151,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.48.1

Attachment: v20250315-0002-Reviewfixups.patch (text/x-patch)
From d56cb09458a5fbee539601c084fc7e72e5e9cfb5 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 14 Mar 2025 15:00:51 +0100
Subject: [PATCH v20250315 2/4] Reviewfixups

---
 src/backend/access/transam/xlog.c            | 26 +++++++++++++++++++-
 src/backend/postmaster/datachecksumsworker.c |  2 +-
 src/backend/utils/init/postinit.c            |  4 +--
 src/include/postmaster/datachecksumsworker.h |  2 +-
 src/test/checksum/Makefile                   |  2 +-
 5 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 61da6d583cd..e4c72f985e4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -660,6 +660,16 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for enabling
+ * checksums is the first one or not. The first procsignalbarrier can in rare
+ * circumstances cause a transition from 'on' to 'on' when a new process is
+ * spawned between the update of XLogCtl->data_checksum_version and the
+ * barrier being emitted.  This can only happen on the very first barrier so
+ * mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
 /*
  * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
  * See SetLocalDataChecksumVersion().
@@ -4935,7 +4945,20 @@ AbsorbChecksumsOnInProgressBarrier(void)
 bool
 AbsorbChecksumsOnBarrier(void)
 {
-	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	/*
+	 * If the process was spawned between updating XLogCtl and emitting the
+	 * barrier it will have seen the updated value, so for the first barrier
+	 * we accept both "on" and "inprogress-on".
+	 */
+	if (InitialDataChecksumTransition)
+	{
+		Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+			   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION));
+		InitialDataChecksumTransition = false;
+	}
+	else
+		Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
@@ -5319,6 +5342,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index 6a201dca8de..81be2808895 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -150,7 +150,7 @@
  *     online operation).
  *
  *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index bae18b449aa..692570eb0f1 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -746,13 +746,13 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
+
 	/*
 	 * Set up backend local cache of Controldata values.
 	 */
 	InitLocalControldata();
 
-	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
-
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
index 59c9000d646..0649232723d 100644
--- a/src/include/postmaster/datachecksumsworker.h
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -4,7 +4,7 @@
  *	  header file for checksum helper background worker
  *
  *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * src/include/postmaster/datachecksumsworker.h
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
index fd03bf73df4..f287001301e 100644
--- a/src/test/checksum/Makefile
+++ b/src/test/checksum/Makefile
@@ -2,7 +2,7 @@
 #
 # Makefile for src/test/checksum
 #
-# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
 # Portions Copyright (c) 1994, Regents of the University of California
 #
 # src/test/checksum/Makefile
-- 
2.48.1

Attachment: v20250315-0003-reworks.patch (text/x-patch)
From 95bfea2b12418671b5f9513dee107a0b18c8a78e Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Fri, 14 Mar 2025 22:04:27 +0100
Subject: [PATCH v20250315 3/4] reworks

---
 src/backend/access/transam/xlog.c       | 45 +++++++++----------------
 src/backend/postmaster/auxprocess.c     | 19 +++++++++++
 src/backend/postmaster/launch_backend.c | 10 ------
 src/backend/utils/init/postinit.c       | 17 ++++++++--
 src/include/access/xlog.h               |  2 +-
 src/include/miscadmin.h                 |  1 -
 6 files changed, 51 insertions(+), 43 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e4c72f985e4..064cc5555dc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -661,12 +661,15 @@ static bool updateMinRecoveryPoint = true;
 static uint32 LocalDataChecksumVersion = 0;
 
 /*
- * Flag to remember if the procsignalbarrier being absorbed for enabling
- * checksums is the first one or not. The first procsignalbarrier can in rare
- * circumstances cause a transition from 'on' to 'on' when a new process is
+ * Flag to remember if the procsignalbarrier being absorbed for checksums
+ * is the first one. The first procsignalbarrier can in rare cases be for
+ * the state we've initialized, i.e. a duplicate. This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_VERSION this
+ * would trigger an assert failure (this is the only transition with an
+ * assert) when processing the barrier. This may happen if the process is
  * spawned between the update of XLogCtl->data_checksum_version and the
- * barrier being emitted.  This can only happen on the very first barrier so
- * mark that with this flag.
+ * barrier being emitted. This can only happen on the very first barrier
+ * so mark that with this flag.
  */
 static bool InitialDataChecksumTransition = true;
 
@@ -4938,6 +4941,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
+	/* XXX can't we check we're in OFF or INPROGRESSS_ON? */
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
@@ -4950,22 +4954,19 @@ AbsorbChecksumsOnBarrier(void)
 	 * barrier it will have seen the updated value, so for the first barrier
 	 * we accept both "on" and "inprogress-on".
 	 */
-	if (InitialDataChecksumTransition)
-	{
-		Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
-			   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION));
-		InitialDataChecksumTransition = false;
-	}
-	else
-		Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+		   (InitialDataChecksumTransition &&
+			(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)));
 
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	InitialDataChecksumTransition = false;
 	return true;
 }
 
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
+	/* XXX can't we check we're in ON or INPROGRESSS_OFF? */
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
@@ -4973,6 +4974,7 @@ AbsorbChecksumsOffInProgressBarrier(void)
 bool
 AbsorbChecksumsOffBarrier(void)
 {
+	/* XXX can't we check we're in INPROGRESSS_OFF? */
 	SetLocalDataChecksumVersion(0);
 	return true;
 }
@@ -4986,7 +4988,7 @@ AbsorbChecksumsOffBarrier(void)
  * purpose enough to handle future cases.
  */
 void
-InitLocalControldata(void)
+InitLocalDataChecksumVersion(void)
 {
 	SpinLockAcquire(&XLogCtl->info_lck);
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
@@ -5024,21 +5026,6 @@ SetLocalDataChecksumVersion(uint32 data_checksum_version)
 	}
 }
 
-/*
- * Initialize the various data checksum values - GUC, local, ....
- */
-void
-InitLocalDataChecksumVersion(void)
-{
-	uint32	data_checksum_version;
-
-	SpinLockAcquire(&XLogCtl->info_lck);
-	data_checksum_version = XLogCtl->data_checksum_version;
-	SpinLockRelease(&XLogCtl->info_lck);
-
-	SetLocalDataChecksumVersion(data_checksum_version);
-}
-
 /*
  * Get the local data_checksum_version (cached XLogCtl value).
  */
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 4f6795f7265..50d5308816c 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(false, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated
+	 * by the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 65bbf770d28..b06b5fb45dd 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -285,16 +285,6 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 			memcpy(MyClientSocket, client_sock, sizeof(ClientSocket));
 		}
 
-		/*
-		 * update the LocalProcessControlFile to match XLogCtl->data_checksum_version
-		 *
-		 * XXX It seems the postmaster (which is what gets forked into the new
-		 * child process) does not absorb the checksum barriers, therefore it
-		 * does not update the value (except after a restart). Not sure if there
-		 * is some sort of race condition.
-		 */
-		InitLocalDataChecksumVersion();
-
 		/*
 		 * Run the appropriate Main function
 		 */
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 692570eb0f1..d1394fc05f9 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -749,9 +749,22 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	ProcSignalInit(MyCancelKeyValid, MyCancelKey);
 
 	/*
-	 * Set up backend local cache of Controldata values.
+	 * Initialize a local cache of the data_checksum_version, to be updated
+	 * by the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
 	 */
-	InitLocalControldata();
+	InitLocalDataChecksumVersion();
 
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index aec3ea0bc63..615b2cf4ec8 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -243,7 +243,7 @@ extern bool AbsorbChecksumsOffInProgressBarrier(void);
 extern bool AbsorbChecksumsOnBarrier(void);
 extern bool AbsorbChecksumsOffBarrier(void);
 extern const char *show_data_checksums(void);
-extern void InitLocalControldata(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 133e7fde290..193ce1b4514 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -542,7 +542,6 @@ extern void RestoreClientConnectionInfo(char *conninfo);
 
 extern uint32 GetLocalDataChecksumVersion(void);
 extern uint32 GetCurrentDataChecksumVersion(void);
-extern void InitLocalDataChecksumVersion(void);
 
 /* in executor/nodeHash.c */
 extern size_t get_hash_memory_limit(void);
-- 
2.48.1

Attachment: v20250315-0004-debug-stuff.patch (text/x-patch)
From d9a9caf0425543ef21700239dcd8d62ccaadfd5b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sat, 15 Mar 2025 10:42:16 +0100
Subject: [PATCH v20250315 4/4] debug stuff

---
 src/backend/access/transam/xlog.c       | 86 ++++++++++++++++++++++++-
 src/backend/postmaster/launch_backend.c |  3 +
 src/backend/storage/ipc/procsignal.c    | 34 ++++++++++
 src/backend/storage/page/bufpage.c      | 35 ++++++++++
 src/backend/utils/init/miscinit.c       |  3 +
 src/bin/pg_controldata/pg_controldata.c |  2 +
 6 files changed, 160 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 064cc5555dc..a01afc32af8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4284,6 +4284,8 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	 * processes get the current value from. (Maybe it should go just there?)
 	 */
 	XLogCtl->data_checksum_version = data_checksum_version;
+
+	elog(LOG, "InitControlFile %p data_checksum_version %u", XLogCtl, ControlFile->data_checksum_version);
 }
 
 static void
@@ -4623,6 +4625,8 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
+
+	elog(LOG, "ReadControlFile ControlFile->data_checksum_version = %u", ControlFile->data_checksum_version);
 }
 
 /*
@@ -4758,6 +4762,8 @@ SetDataChecksumsOnInProgress(void)
 	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
 	SpinLockRelease(&XLogCtl->info_lck);
 
+	elog(LOG, "SetDataChecksumsOnInProgress XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
 
 	END_CRIT_SECTION();
@@ -4823,6 +4829,8 @@ SetDataChecksumsOn(void)
 	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
 	SpinLockRelease(&XLogCtl->info_lck);
 
+	elog(LOG, "SetDataChecksumsOn XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
 
 	END_CRIT_SECTION();
@@ -4861,6 +4869,7 @@ SetDataChecksumsOff(void)
 	/* If data checksums are already disabled there is nothing to do */
 	if (XLogCtl->data_checksum_version == 0)
 	{
+		elog(LOG, "SetDataChecksumsOff XLogCtl->data_checksum_version = %u (SKIP)", XLogCtl->data_checksum_version);
 		SpinLockRelease(&XLogCtl->info_lck);
 		return;
 	}
@@ -4885,6 +4894,8 @@ SetDataChecksumsOff(void)
 		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
 		SpinLockRelease(&XLogCtl->info_lck);
 
+		elog(LOG, "SetDataChecksumsOff XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
 
 		END_CRIT_SECTION();
@@ -4905,6 +4916,8 @@ SetDataChecksumsOff(void)
 	}
 	else
 	{
+		elog(LOG, "SetDataChecksumsOff XLogCtl->data_checksum_version = %u (SKIP)", XLogCtl->data_checksum_version);
+
 		/*
 		 * Ending up here implies that the checksums state is "inprogress-on"
 		 * or "inprogress-off" and we can transition directly to "off" from
@@ -4942,6 +4955,7 @@ bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
 	/* XXX can't we check we're in OFF or INPROGRESSS_ON? */
+	elog(LOG, "AbsorbChecksumsOnInProgressBarrier");
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
@@ -4949,6 +4963,8 @@ AbsorbChecksumsOnInProgressBarrier(void)
 bool
 AbsorbChecksumsOnBarrier(void)
 {
+	elog(LOG, "AbsorbChecksumsOnBarrier");
+
 	/*
 	 * If the process was spawned between updating XLogCtl and emitting the
 	 * barrier it will have seen the updated value, so for the first barrier
@@ -4967,6 +4983,7 @@ bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
 	/* XXX can't we check we're in ON or INPROGRESSS_OFF? */
+	elog(LOG, "AbsorbChecksumsOffInProgressBarrier");
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
@@ -4975,6 +4992,7 @@ bool
 AbsorbChecksumsOffBarrier(void)
 {
 	/* XXX can't we check we're in INPROGRESSS_OFF? */
+	elog(LOG, "AbsorbChecksumsOffBarrier");
 	SetLocalDataChecksumVersion(0);
 	return true;
 }
@@ -4990,6 +5008,7 @@ AbsorbChecksumsOffBarrier(void)
 void
 InitLocalDataChecksumVersion(void)
 {
+	elog(LOG, "InitLocalDataChecksumVersion XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
 	SpinLockAcquire(&XLogCtl->info_lck);
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
 	SpinLockRelease(&XLogCtl->info_lck);
@@ -5004,6 +5023,7 @@ InitLocalDataChecksumVersion(void)
 void
 SetLocalDataChecksumVersion(uint32 data_checksum_version)
 {
+	elog(LOG, "SetLocalDataChecksumVersion %u", data_checksum_version);
 	LocalDataChecksumVersion = data_checksum_version;
 
 	switch (LocalDataChecksumVersion)
@@ -5499,9 +5519,15 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	elog(LOG, "XLogCtl->data_checksum_version %u ControlFile->data_checksum_version %u",
+		 XLogCtl->data_checksum_version, ControlFile->data_checksum_version);
+
 	/* use the checksum info from control file */
 	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
 
+	elog(LOG, "XLogCtl->data_checksum_version %u ControlFile->data_checksum_version %u (UPDATED)",
+		 XLogCtl->data_checksum_version, ControlFile->data_checksum_version);
+
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
 
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
@@ -6635,6 +6661,8 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	elog(LOG, "StartupXLOG XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 	/*
 	 * If we reach this point with checksums being enabled ("inprogress-on"
 	 * state), we notify the user that they need to manually restart the
@@ -6652,6 +6680,9 @@ StartupXLOG(void)
 	 * we know that we have a state where all backends have stopped validating
 	 * checksums and we can move to off instead of prompting the user to
 	 * perform any action.
+	 *
+	 * XXX Is it safe to access the data_checksum_version without holding the
+	 * spinlock?
 	 */
 	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
 	{
@@ -6660,6 +6691,8 @@ StartupXLOG(void)
 		SpinLockAcquire(&XLogCtl->info_lck);
 		XLogCtl->data_checksum_version = 0;
 		SpinLockRelease(&XLogCtl->info_lck);
+
+		elog(LOG, "StartupXLOG XLogCtl->data_checksum_version = %u (UPDATED)", XLogCtl->data_checksum_version);
 	}
 
 	/*
@@ -7523,6 +7556,8 @@ CreateCheckPoint(int flags)
 	 */
 	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
 
+	elog(WARNING, "CREATECHECKPOINT XLogCtl->data_checksum_version %u", XLogCtl->data_checksum_version);
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7778,6 +7813,10 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	elog(LOG, "CreateCheckPoint data_checksum_version %u %u",
+		 ControlFile->data_checksum_version,
+		 checkPoint.data_checksum_version);
+
 	/* make sure we start with the checksum version as of the checkpoint */
 	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
 
@@ -7928,6 +7967,10 @@ CreateEndOfRecoveryRecord(void)
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
 
+	elog(LOG, "CreateEndOfRecoveryRecord data_checksum_version %u xlog %u",
+		 ControlFile->data_checksum_version,
+		 XLogCtl->data_checksum_version);
+
 	/* start with the latest checksum version (as of the end of recovery) */
 	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
 
@@ -8101,6 +8144,14 @@ RecoveryRestartPoint(const CheckPoint *checkPoint, XLogReaderState *record)
 	XLogCtl->lastCheckPointEndPtr = record->EndRecPtr;
 	XLogCtl->lastCheckPoint = *checkPoint;
 	SpinLockRelease(&XLogCtl->info_lck);
+
+	elog(LOG, "RecoveryRestartPoint lastCheckPointRecPtr %X/%X XLogCtl->lastCheckPointEndPtr %X/%X redo %X/%X checksums xlogctl %u record %d file %u",
+		 LSN_FORMAT_ARGS(XLogCtl->lastCheckPointRecPtr),
+		 LSN_FORMAT_ARGS(XLogCtl->lastCheckPointEndPtr),
+		 LSN_FORMAT_ARGS(XLogCtl->lastCheckPoint.redo),
+		 XLogCtl->lastCheckPoint.data_checksum_version,
+		 checkPoint->data_checksum_version,
+		 ControlFile->data_checksum_version);
 }
 
 /*
@@ -8138,6 +8189,9 @@ CreateRestartPoint(int flags)
 	lastCheckPoint = XLogCtl->lastCheckPoint;
 	SpinLockRelease(&XLogCtl->info_lck);
 
+	elog(LOG, "CreateRestartPoint lastCheckPointRecPtr %X/%X lastCheckPointEndPtr %X/%X",
+		 LSN_FORMAT_ARGS(lastCheckPointRecPtr), LSN_FORMAT_ARGS(lastCheckPointEndPtr));
+
 	/*
 	 * Check that we're still in recovery mode. It's ok if we exit recovery
 	 * mode after this check, the restart point is valid anyway.
@@ -8166,9 +8220,11 @@ CreateRestartPoint(int flags)
 	if (XLogRecPtrIsInvalid(lastCheckPointRecPtr) ||
 		lastCheckPoint.redo <= ControlFile->checkPointCopy.redo)
 	{
-		ereport(DEBUG2,
-				(errmsg_internal("skipping restartpoint, already performed at %X/%X",
-								 LSN_FORMAT_ARGS(lastCheckPoint.redo))));
+		ereport(LOG,
+				(errmsg_internal("CreateRestartPoint: skipping restartpoint, already performed at %X/%X <= %X/%X lastCheckPointRecPtr %X/%X",
+								 LSN_FORMAT_ARGS(lastCheckPoint.redo),
+								 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
+								 LSN_FORMAT_ARGS(lastCheckPointRecPtr))));
 
 		UpdateMinRecoveryPoint(InvalidXLogRecPtr, true);
 		if (flags & CHECKPOINT_IS_SHUTDOWN)
@@ -8236,6 +8292,8 @@ CreateRestartPoint(int flags)
 	 * end-of-recovery checkpoint.
 	 */
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	elog(LOG, "CreateRestartPoint ControlFile->checkPointCopy.redo %X/%X lastCheckPoint.redo %X/%X",
+		 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo), LSN_FORMAT_ARGS(lastCheckPoint.redo));
 	if (ControlFile->checkPointCopy.redo < lastCheckPoint.redo)
 	{
 		/*
@@ -8274,11 +8332,17 @@ CreateRestartPoint(int flags)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
 
+		elog(LOG, "CreateRestartPoint data_checksum_version %u %u",
+			 ControlFile->data_checksum_version,
+			 lastCheckPoint.data_checksum_version);
+
 		/* we shall start with the latest checksum version */
 		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
 
 		UpdateControlFile();
 	}
+	else
+		elog(LOG, "CreateRestartPoint: skipped ControlFile update");
 	LWLockRelease(ControlFileLock);
 
 	/*
@@ -9136,13 +9200,25 @@ xlog_redo(XLogReaderState *record)
 	{
 		xl_checksum_state state;
 		uint64		barrier;
+		XLogRecPtr	checkpointLsn;
+		uint32		value,
+					value_last;
 
 		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
 
 		SpinLockAcquire(&XLogCtl->info_lck);
+		value_last = XLogCtl->data_checksum_version;
 		XLogCtl->data_checksum_version = state.new_checksumtype;
 		SpinLockRelease(&XLogCtl->info_lck);
 
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		checkpointLsn = ControlFile->checkPoint;
+		value = ControlFile->data_checksum_version;
+		LWLockRelease(ControlFileLock);
+
+		elog(LOG, "XLOG_CHECKSUMS xlog_redo %X/%X control checkpoint %X/%X control %u last %u record %u",
+			 LSN_FORMAT_ARGS(lsn), LSN_FORMAT_ARGS(checkpointLsn), value, value_last, state.new_checksumtype);
+
 		/*
 		 * Block on a procsignalbarrier to await all processes having seen the
 		 * change to checksum status. Once the barrier has been passed we can
@@ -9151,22 +9227,26 @@ xlog_redo(XLogReaderState *record)
 		switch (state.new_checksumtype)
 		{
 			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			case PG_DATA_CHECKSUM_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_ON");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			default:
 				Assert(state.new_checksumtype == 0);
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_OFF");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
 				WaitForProcSignalBarrier(barrier);
 				break;
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index b06b5fb45dd..2119c5d59b4 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -239,6 +239,9 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 	if (IsExternalConnectionBackend(child_type))
 		((BackendStartupData *) startup_data)->fork_started = GetCurrentTimestamp();
 
+	elog(LOG, "postmaster_child_launch: LocalDataChecksumVersion %u xlog %u", GetLocalDataChecksumVersion(),
+	GetCurrentDataChecksumVersion());
+
 #ifdef EXEC_BACKEND
 	pid = internal_forkexec(child_process_kinds[child_type].name, child_slot,
 							startup_data, startup_data_len, client_sock);
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 2b13a8cd260..b88d1d07431 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -553,6 +553,40 @@ ProcessProcSignalBarrier(void)
 
 		PG_TRY();
 		{
+			/* print info about barriers */
+			{
+				uint32	tmp = flags;
+
+				elog(LOG, "ProcessProcSignalBarrier flags %u", tmp);
+
+				while (tmp != 0)
+				{
+					ProcSignalBarrierType type;
+
+					type = (ProcSignalBarrierType) pg_rightmost_one_pos32(tmp);
+					switch (type)
+					{
+						case PROCSIGNAL_BARRIER_SMGRRELEASE:
+							elog(LOG, "PROCSIGNAL_BARRIER_SMGRRELEASE");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_ON");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_OFF");
+							break;
+					}
+
+					BARRIER_CLEAR_BIT(tmp, type);
+				}
+			}
+
 			/*
 			 * Process each type of barrier. The barrier-processing functions
 			 * should normally return true, but may return false if the
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 8bd5fed8c85..a8af9a7068b 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -22,6 +22,7 @@
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 
+#include <execinfo.h>
 
 /* GUC variable */
 bool		ignore_checksum_failure = false;
@@ -136,10 +137,44 @@ PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags)
 	if (checksum_failure)
 	{
 		if ((flags & PIV_LOG_WARNING) != 0)
+		{
+			XLogRecPtr	lsn = PageGetLSN(page);
+
 			ereport(WARNING,
 					(errcode(ERRCODE_DATA_CORRUPTED),
 					 errmsg("page verification failed, calculated checksum %u but expected %u",
 							checksum, p->pd_checksum)));
+			ereport(WARNING,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("page verification failed, LSN %X/%X", LSN_FORMAT_ARGS(lsn))));
+			ereport(WARNING,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("page verification failed, header flags %u lower %u upper %u special %u pagesize_version %u prune_xid %u",
+							p->pd_flags,
+							p->pd_lower,
+							p->pd_upper,
+							p->pd_special,
+							p->pd_pagesize_version,
+							p->pd_prune_xid)));
+
+			{
+#define BT_BUF_SIZE 32
+				int nptrs;
+				void *buffer[BT_BUF_SIZE];
+				char **strings;
+
+				nptrs = backtrace(buffer, BT_BUF_SIZE);
+				strings = backtrace_symbols(buffer, nptrs);
+
+				if (strings != NULL)
+				{
+					for (int i = 0; i < nptrs; i++)
+					{
+						elog(WARNING, "backtrace %d : %s", i, strings[i]);
+					}
+				}
+			}
+		}
 
 		if ((flags & PIV_REPORT_STAT) != 0)
 			pgstat_report_checksum_failure();
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index a071ba6f455..df52ce8ad7f 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -164,6 +164,9 @@ InitPostmasterChild(void)
 				(errcode_for_socket_access(),
 				 errmsg_internal("could not set postmaster death monitoring pipe to FD_CLOEXEC mode: %m")));
 #endif
+
+	elog(LOG, "InitPostmasterChild: LocalDataChecksumVersion %u xlog %u", GetLocalDataChecksumVersion(),
+	GetCurrentDataChecksumVersion());
 }
 
 /*
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index bea779eef94..3d176fc0a9f 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%X\n"),
-- 
2.48.1

Attachment: test.sh (application/x-shellscript)
#42Andres Freund
andres@anarazel.de
In reply to: Tomas Vondra (#41)
Re: Changing the state of data checksums in a running cluster

Jo.

On 2025-03-15 16:50:02 +0100, Tomas Vondra wrote:

Thanks, here's an updated patch version

FWIW, this fails in CI;

https://cirrus-ci.com/build/4678473324691456
On all OSs:
[16:08:36.331] # Failed test 'options --locale-provider=icu --locale=und --lc-*=C: no stderr'
[16:08:36.331] # at /tmp/cirrus-ci-build/src/bin/initdb/t/001_initdb.pl line 132.
[16:08:36.331] # got: '2025-03-15 16:08:26.216 UTC [63153] LOG: XLogCtl->data_checksum_version 0 ControlFile->data_checksum_version 0
[16:08:36.331] # 2025-03-15 16:08:26.216 UTC [63153] LOG: XLogCtl->data_checksum_version 0 ControlFile->data_checksum_version 0 (UPDATED)

Windows & Compiler warnings:
[16:05:08.723] ../src/backend/storage/page/bufpage.c(25): fatal error C1083: Cannot open include file: 'execinfo.h': No such file or directory

[16:18:52.385] bufpage.c:25:10: fatal error: execinfo.h: No such file or directory
[16:18:52.385] 25 | #include <execinfo.h>
[16:18:52.385] | ^~~~~~~~~~~~

Greetings,

Andres Freund

#43Tomas Vondra
tomas@vondra.me
In reply to: Andres Freund (#42)
Re: Changing the state of data checksums in a running cluster

On 3/15/25 17:26, Andres Freund wrote:

Jo.

On 2025-03-15 16:50:02 +0100, Tomas Vondra wrote:

Thanks, here's an updated patch version

FWIW, this fails in CI;

https://cirrus-ci.com/build/4678473324691456
On all OSs:
[16:08:36.331] # Failed test 'options --locale-provider=icu --locale=und --lc-*=C: no stderr'
[16:08:36.331] # at /tmp/cirrus-ci-build/src/bin/initdb/t/001_initdb.pl line 132.
[16:08:36.331] # got: '2025-03-15 16:08:26.216 UTC [63153] LOG: XLogCtl->data_checksum_version 0 ControlFile->data_checksum_version 0
[16:08:36.331] # 2025-03-15 16:08:26.216 UTC [63153] LOG: XLogCtl->data_checksum_version 0 ControlFile->data_checksum_version 0 (UPDATED)

Windows & Compiler warnings:
[16:05:08.723] ../src/backend/storage/page/bufpage.c(25): fatal error C1083: Cannot open include file: 'execinfo.h': No such file or directory

[16:18:52.385] bufpage.c:25:10: fatal error: execinfo.h: No such file or directory
[16:18:52.385] 25 | #include <execinfo.h>
[16:18:52.385] | ^~~~~~~~~~~~

Greetings,

Yeah, that's just the "debug stuff" - I don't expect any of that to be
included in the commit, I only posted it for convenience. It adds a lot
of debug logging, which I hope might help others to understand what the
problem with checksums on standby is.
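
As an aside, if it is useful to keep the CI runs green while the debug
patch is applied, the backtrace bit could be guarded the same way elog.c
guards its backtrace support. A sketch, assuming the usual HAVE_EXECINFO_H
and HAVE_BACKTRACE_SYMBOLS defines from pg_config.h (not meant for the
real patch either):

    #ifdef HAVE_EXECINFO_H
    #include <execinfo.h>
    #endif

    /* ... and around the dump in PageIsVerifiedExtended() ... */
    #ifdef HAVE_BACKTRACE_SYMBOLS
        {
            void       *frames[32];
            int         nframes = backtrace(frames, lengthof(frames));
            char      **symbols = backtrace_symbols(frames, nframes);

            if (symbols != NULL)
            {
                for (int i = 0; i < nframes; i++)
                    elog(WARNING, "backtrace %d: %s", i, symbols[i]);
                free(symbols);
            }
        }
    #endif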

regards

--
Tomas Vondra

#44Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tomas Vondra (#43)
Re: Changing the state of data checksums in a running cluster

Hi!

On Sat, Mar 15, 2025 at 7:33 PM Tomas Vondra <tomas@vondra.me> wrote:

On 3/15/25 17:26, Andres Freund wrote:

Jo.

On 2025-03-15 16:50:02 +0100, Tomas Vondra wrote:

Thanks, here's an updated patch version

FWIW, this fails in CI;

https://cirrus-ci.com/build/4678473324691456
On all OSs:
[16:08:36.331] # Failed test 'options --locale-provider=icu

--locale=und --lc-*=C: no stderr'

[16:08:36.331] # at /tmp/cirrus-ci-build/src/bin/initdb/t/

001_initdb.pl line 132.

[16:08:36.331] # got: '2025-03-15 16:08:26.216 UTC [63153]

LOG: XLogCtl->data_checksum_version 0 ControlFile->data_checksum_version 0

[16:08:36.331] # 2025-03-15 16:08:26.216 UTC [63153] LOG:

XLogCtl->data_checksum_version 0 ControlFile->data_checksum_version 0
(UPDATED)

Windows & Compiler warnings:
[16:05:08.723] ../src/backend/storage/page/bufpage.c(25): fatal error

C1083: Cannot open include file: 'execinfo.h': No such file or directory

[16:18:52.385] bufpage.c:25:10: fatal error: execinfo.h: No such file or

directory

[16:18:52.385] 25 | #include <execinfo.h>
[16:18:52.385] | ^~~~~~~~~~~~

Greetings,

Yeah, that's just the "debug stuff" - I don't expect any of that to be
included in the commit, I only posted it for convenience. It adds a lot
of debug logging, which I hope might help others to understand what the
problem with checksums on standby is.

I took a look at this patch. I have following notes.
1) I think reporting of these errors could be better, more detailed.
Especially the second one could be similar to some of other errors on
checksums processing.
ereport(ERROR,
(errmsg("failed to start background worker to process
data checksums")));
ereport(ERROR,
(errmsg("unable to enable data checksums in cluster")));

2) ProcessAllDatabases() contains loop, which repeats scanning the new
databases for checkums. It continues while there are new database on each
iteration. Could we just limit the number of iterations to 2? Given at
each step we're calling WaitForAllTransactionsToFinish(), everything that
gets created after first WaitForAllTransactionsToFinish() call should have
checksums enabled in the beginning.

------
Regards,
Alexander Korotkov
Supabase

#45Bernd Helmle
mailings@oopsware.de
In reply to: Tomas Vondra (#41)
5 attachment(s)
Re: Changing the state of data checksums in a running cluster

Am Samstag, dem 15.03.2025 um 16:50 +0100 schrieb Tomas Vondra:

I wonder if this "time travel backwards" might be an issue for this
too,
because it might mean we end up picking the wrong
data_checksum_version
from the control file. In any case, if this happens, we don't get to
the
ControlFile->data_checksum_version update a bit further down. And
there's another condition that can skip that.

I'll continue investigating this next week, but at this point I'm
quite
confused and would be grateful for any insights ...

Hi,

Since i wanted to dig a little deeper in this patch i took the
opportunity and rebased it to current master, hopefully not having
broken something seriously.

Thanks,
Bernd

Attachments:

v20250711-0004-debug-stuff.patchtext/x-patch; charset=UTF-8; name=v20250711-0004-debug-stuff.patchDownload
From 9292aac9d8c3a78ceab7aaa5a1c2d80432fbcbaa Mon Sep 17 00:00:00 2001
From: Bernd Helmle <Bernd Helmle mailings@oopsware.de>
Date: Fri, 11 Jul 2025 16:10:32 +0200
Subject: [PATCH 4/4] debug stuff

---
 src/backend/access/transam/xlog.c       | 86 ++++++++++++++++++++++++-
 src/backend/postmaster/launch_backend.c |  3 +
 src/backend/storage/ipc/procsignal.c    | 34 ++++++++++
 src/backend/storage/page/bufpage.c      | 35 ++++++++++
 src/backend/utils/init/miscinit.c       |  3 +
 src/bin/pg_controldata/pg_controldata.c |  2 +
 6 files changed, 160 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 097db70fcfc..a6b6c0eaddd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4398,6 +4398,8 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	 * processes get the current value from. (Maybe it should go just there?)
 	 */
 	XLogCtl->data_checksum_version = data_checksum_version;
+
+	elog(LOG, "InitControlFile %p data_checksum_version %u", XLogCtl, ControlFile->data_checksum_version);
 }
 
 static void
@@ -4737,6 +4739,8 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
+
+	elog(LOG, "ReadControlFile ControlFile->data_checksum_version = %u", ControlFile->data_checksum_version);
 }
 
 /*
@@ -4872,6 +4876,8 @@ SetDataChecksumsOnInProgress(void)
 	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
 	SpinLockRelease(&XLogCtl->info_lck);
 
+	elog(LOG, "SetDataChecksumsOnInProgress XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
 
 	END_CRIT_SECTION();
@@ -4937,6 +4943,8 @@ SetDataChecksumsOn(void)
 	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
 	SpinLockRelease(&XLogCtl->info_lck);
 
+	elog(LOG, "SetDataChecksumsOn XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
 
 	END_CRIT_SECTION();
@@ -4975,6 +4983,7 @@ SetDataChecksumsOff(void)
 	/* If data checksums are already disabled there is nothing to do */
 	if (XLogCtl->data_checksum_version == 0)
 	{
+		elog(LOG, "SetDataChecksumsOff XLogCtl->data_checksum_version = %u (SKIP)", XLogCtl->data_checksum_version);
 		SpinLockRelease(&XLogCtl->info_lck);
 		return;
 	}
@@ -4999,6 +5008,8 @@ SetDataChecksumsOff(void)
 		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
 		SpinLockRelease(&XLogCtl->info_lck);
 
+		elog(LOG, "SetDataChecksumsOff XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
 
 		END_CRIT_SECTION();
@@ -5019,6 +5030,8 @@ SetDataChecksumsOff(void)
 	}
 	else
 	{
+		elog(LOG, "SetDataChecksumsOff XLogCtl->data_checksum_version = %u (SKIP)", XLogCtl->data_checksum_version);
+
 		/*
 		 * Ending up here implies that the checksums state is "inprogress-on"
 		 * or "inprogress-off" and we can transition directly to "off" from
@@ -5056,6 +5069,7 @@ bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
 	/* XXX can't we check we're in OFF or INPROGRESSS_ON? */
+	elog(LOG, "AbsorbChecksumsOnInProgressBarrier");
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
@@ -5063,6 +5077,8 @@ AbsorbChecksumsOnInProgressBarrier(void)
 bool
 AbsorbChecksumsOnBarrier(void)
 {
+	elog(LOG, "AbsorbChecksumsOnBarrier");
+
 	/*
 	 * If the process was spawned between updating XLogCtl and emitting the
 	 * barrier it will have seen the updated value, so for the first barrier
@@ -5081,6 +5097,7 @@ bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
 	/* XXX can't we check we're in ON or INPROGRESSS_OFF? */
+	elog(LOG, "AbsorbChecksumsOffInProgressBarrier");
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
@@ -5089,6 +5106,7 @@ bool
 AbsorbChecksumsOffBarrier(void)
 {
 	/* XXX can't we check we're in INPROGRESSS_OFF? */
+	elog(LOG, "AbsorbChecksumsOffBarrier");
 	SetLocalDataChecksumVersion(0);
 	return true;
 }
@@ -5104,6 +5122,7 @@ AbsorbChecksumsOffBarrier(void)
 void
 InitLocalDataChecksumVersion(void)
 {
+	elog(LOG, "InitLocalDataChecksumVersion XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
 	SpinLockAcquire(&XLogCtl->info_lck);
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
 	SpinLockRelease(&XLogCtl->info_lck);
@@ -5118,6 +5137,7 @@ InitLocalDataChecksumVersion(void)
 void
 SetLocalDataChecksumVersion(uint32 data_checksum_version)
 {
+	elog(LOG, "SetLocalDataChecksumVersion %u", data_checksum_version);
 	LocalDataChecksumVersion = data_checksum_version;
 
 	switch (LocalDataChecksumVersion)
@@ -5615,9 +5635,15 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	elog(LOG, "XLogCtl->data_checksum_version %u ControlFile->data_checksum_version %u",
+		 XLogCtl->data_checksum_version, ControlFile->data_checksum_version);
+
 	/* use the checksum info from control file */
 	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
 
+	elog(LOG, "XLogCtl->data_checksum_version %u ControlFile->data_checksum_version %u (UPDATED)",
+		 XLogCtl->data_checksum_version, ControlFile->data_checksum_version);
+
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
 
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
@@ -6758,6 +6784,8 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	elog(LOG, "StartupXLOG XLogCtl->data_checksum_version = %u", XLogCtl->data_checksum_version);
+
 	/*
 	 * If we reach this point with checksums being enabled ("inprogress-on"
 	 * state), we notify the user that they need to manually restart the
@@ -6775,6 +6803,9 @@ StartupXLOG(void)
 	 * we know that we have a state where all backends have stopped validating
 	 * checksums and we can move to off instead of prompting the user to
 	 * perform any action.
+	 *
+	 * XXX Is it safe to access the data_checksum_version without holding the
+	 * spinlock?
 	 */
 	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
 	{
@@ -6783,6 +6814,8 @@ StartupXLOG(void)
 		SpinLockAcquire(&XLogCtl->info_lck);
 		XLogCtl->data_checksum_version = 0;
 		SpinLockRelease(&XLogCtl->info_lck);
+
+		elog(LOG, "StartupXLOG XLogCtl->data_checksum_version = %u (UPDATED)", XLogCtl->data_checksum_version);
 	}
 
 	/*
@@ -7646,6 +7679,8 @@ CreateCheckPoint(int flags)
 	 */
 	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
 
+	elog(WARNING, "CREATECHECKPOINT XLogCtl->data_checksum_version %u", XLogCtl->data_checksum_version);
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7901,6 +7936,10 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	elog(LOG, "CreateCheckPoint data_checksum_version %u %u",
+		 ControlFile->data_checksum_version,
+		 checkPoint.data_checksum_version);
+
 	/* make sure we start with the checksum version as of the checkpoint */
 	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
 
@@ -8055,6 +8094,10 @@ CreateEndOfRecoveryRecord(void)
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
 
+	elog(LOG, "CreateEndOfRecoveryRecord data_checksum_version %u xlog %u",
+		 ControlFile->data_checksum_version,
+		 XLogCtl->data_checksum_version);
+
 	/* start with the latest checksum version (as of the end of recovery) */
 	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
 
@@ -8227,6 +8270,14 @@ RecoveryRestartPoint(const CheckPoint *checkPoint, XLogReaderState *record)
 	XLogCtl->lastCheckPointEndPtr = record->EndRecPtr;
 	XLogCtl->lastCheckPoint = *checkPoint;
 	SpinLockRelease(&XLogCtl->info_lck);
+
+	elog(LOG, "RecoveryRestartPoint lastCheckPointRecPtr %X/%X XLogCtl->lastCheckPointEndPtr %X/%X redo %X/%X checksums xlogctl %u record %d file %u",
+		 LSN_FORMAT_ARGS(XLogCtl->lastCheckPointRecPtr),
+		 LSN_FORMAT_ARGS(XLogCtl->lastCheckPointEndPtr),
+		 LSN_FORMAT_ARGS(XLogCtl->lastCheckPoint.redo),
+		 XLogCtl->lastCheckPoint.data_checksum_version,
+		 checkPoint->data_checksum_version,
+		 ControlFile->data_checksum_version);
 }
 
 /*
@@ -8264,6 +8315,9 @@ CreateRestartPoint(int flags)
 	lastCheckPoint = XLogCtl->lastCheckPoint;
 	SpinLockRelease(&XLogCtl->info_lck);
 
+	elog(LOG, "CreateRestartPoint lastCheckPointRecPtr %X/%X lastCheckPointEndPtr %X/%X",
+		 LSN_FORMAT_ARGS(lastCheckPointRecPtr), LSN_FORMAT_ARGS(lastCheckPointEndPtr));
+
 	/*
 	 * Check that we're still in recovery mode. It's ok if we exit recovery
 	 * mode after this check, the restart point is valid anyway.
@@ -8292,9 +8346,11 @@ CreateRestartPoint(int flags)
 	if (XLogRecPtrIsInvalid(lastCheckPointRecPtr) ||
 		lastCheckPoint.redo <= ControlFile->checkPointCopy.redo)
 	{
-		ereport(DEBUG2,
-				errmsg_internal("skipping restartpoint, already performed at %X/%08X",
-								LSN_FORMAT_ARGS(lastCheckPoint.redo)));
+		ereport(LOG,
+				(errmsg_internal("CreateRestartPoint: skipping restartpoint, already performed at %X/%X <= %X/%X lastCheckPointRecPtr %X/%X",
+								 LSN_FORMAT_ARGS(lastCheckPoint.redo),
+								 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
+								 LSN_FORMAT_ARGS(lastCheckPointRecPtr))));
 
 		UpdateMinRecoveryPoint(InvalidXLogRecPtr, true);
 		if (flags & CHECKPOINT_IS_SHUTDOWN)
@@ -8362,6 +8418,8 @@ CreateRestartPoint(int flags)
 	 * end-of-recovery checkpoint.
 	 */
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	elog(LOG, "CreateRestartPoint ControlFile->checkPointCopy.redo %X/%X lastCheckPoint.redo %X/%X",
+		 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo), LSN_FORMAT_ARGS(lastCheckPoint.redo));
 	if (ControlFile->checkPointCopy.redo < lastCheckPoint.redo)
 	{
 		/*
@@ -8400,11 +8458,17 @@ CreateRestartPoint(int flags)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
 
+		elog(LOG, "CreateRestartPoint data_checksum_version %u %u",
+			 ControlFile->data_checksum_version,
+			 lastCheckPoint.data_checksum_version);
+
 		/* we shall start with the latest checksum version */
 		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
 
 		UpdateControlFile();
 	}
+	else
+		elog(LOG, "CreateRestartPoint: skipped ControlFile update");
 	LWLockRelease(ControlFileLock);
 
 	/*
@@ -9264,13 +9328,25 @@ xlog_redo(XLogReaderState *record)
 	{
 		xl_checksum_state state;
 		uint64		barrier;
+		XLogRecPtr	checkpointLsn;
+		uint32		value,
+					value_last;
 
 		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
 
 		SpinLockAcquire(&XLogCtl->info_lck);
+		value_last = XLogCtl->data_checksum_version;
 		XLogCtl->data_checksum_version = state.new_checksumtype;
 		SpinLockRelease(&XLogCtl->info_lck);
 
+		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+		checkpointLsn = ControlFile->checkPoint;
+		value = ControlFile->data_checksum_version;
+		LWLockRelease(ControlFileLock);
+
+		elog(LOG, "XLOG_CHECKSUMS xlog_redo %X/%X control checkpoint %X/%X control %u last %u record %u",
+			 LSN_FORMAT_ARGS(lsn), LSN_FORMAT_ARGS(checkpointLsn), value, value_last, state.new_checksumtype);
+
 		/*
 		 * Block on a procsignalbarrier to await all processes having seen the
 		 * change to checksum status. Once the barrier has been passed we can
@@ -9279,22 +9355,26 @@ xlog_redo(XLogReaderState *record)
 		switch (state.new_checksumtype)
 		{
 			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			case PG_DATA_CHECKSUM_VERSION:
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_ON");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
 				WaitForProcSignalBarrier(barrier);
 				break;
 
 			default:
 				Assert(state.new_checksumtype == 0);
+				elog(LOG, "XLOG_CHECKSUMS emit PROCSIGNAL_BARRIER_CHECKSUM_OFF");
 				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
 				WaitForProcSignalBarrier(barrier);
 				break;
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 955df32be5d..d2ae41172a6 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -241,6 +241,9 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 	if (IsExternalConnectionBackend(child_type))
 		((BackendStartupData *) startup_data)->fork_started = GetCurrentTimestamp();
 
+	elog(LOG, "postmaster_child_launch: LocalDataChecksumVersion %u xlog %u",
+		 GetLocalDataChecksumVersion(), GetCurrentDataChecksumVersion());
+
 #ifdef EXEC_BACKEND
 	pid = internal_forkexec(child_process_kinds[child_type].name, child_slot,
 							startup_data, startup_data_len, client_sock);
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index c48abee2b60..d5e1bb12ba5 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -554,6 +554,40 @@ ProcessProcSignalBarrier(void)
 
 		PG_TRY();
 		{
+			/* print info about barriers */
+			{
+				uint32	tmp = flags;
+
+				elog(LOG, "ProcessProcSignalBarrier flags %u", tmp);
+
+				while (tmp != 0)
+				{
+					ProcSignalBarrierType type;
+
+					type = (ProcSignalBarrierType) pg_rightmost_one_pos32(tmp);
+					switch (type)
+					{
+						case PROCSIGNAL_BARRIER_SMGRRELEASE:
+							elog(LOG, "PROCSIGNAL_BARRIER_SMGRRELEASE");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_ON");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF");
+							break;
+						case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+							elog(LOG, "PROCSIGNAL_BARRIER_CHECKSUM_OFF");
+							break;
+					}
+
+					BARRIER_CLEAR_BIT(tmp, type);
+				}
+			}
+
 			/*
 			 * Process each type of barrier. The barrier-processing functions
 			 * should normally return true, but may return false if the
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 19cf6512e52..7c8f4d73606 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -22,6 +22,7 @@
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
 
+#include <execinfo.h>
 
 /* GUC variable */
 bool		ignore_checksum_failure = false;
@@ -149,10 +150,44 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	if (checksum_failure)
 	{
 		if ((flags & (PIV_LOG_WARNING | PIV_LOG_LOG)) != 0)
+		{
+			XLogRecPtr	lsn = PageGetLSN(page);
+
 			ereport(flags & PIV_LOG_WARNING ? WARNING : LOG,
 					(errcode(ERRCODE_DATA_CORRUPTED),
 					 errmsg("page verification failed, calculated checksum %u but expected %u",
 							checksum, p->pd_checksum)));
+			ereport(WARNING,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("page verification failed, LSN %X/%X", LSN_FORMAT_ARGS(lsn))));
+			ereport(WARNING,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("page verification failed, header flags %u lower %u upper %u special %u pagesize_version %u prune_xid %u",
+							p->pd_flags,
+							p->pd_lower,
+							p->pd_upper,
+							p->pd_special,
+							p->pd_pagesize_version,
+							p->pd_prune_xid)));
+
+			{
+#define BT_BUF_SIZE 32
+				int nptrs;
+				void *buffer[BT_BUF_SIZE];
+				char **strings;
+
+				nptrs = backtrace(buffer, BT_BUF_SIZE);
+				strings = backtrace_symbols(buffer, nptrs);
+
+				if (strings != NULL)
+				{
+					for (int i = 0; i < nptrs; i++)
+					{
+						elog(WARNING, "backtrace %d : %s", i, strings[i]);
+					}
+				}
+			}
+		}
 
 		if (header_sane && (flags & PIV_IGNORE_CHECKSUM_FAILURE))
 			return true;
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index e39498d1250..25f35c2690c 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -164,6 +164,9 @@ InitPostmasterChild(void)
 				(errcode_for_socket_access(),
 				 errmsg_internal("could not set postmaster death monitoring pipe to FD_CLOEXEC mode: %m")));
 #endif
+
+	elog(LOG, "InitPostmasterChild: LocalDataChecksumVersion %u xlog %u",
+		 GetLocalDataChecksumVersion(), GetCurrentDataChecksumVersion());
 }
 
 /*
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 10de058ce91..acf5c7b026e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
-- 
2.50.0

Attachment: test.sh (application/x-shellscript)
Attachment: v20250711-0001-Online-enabling-and-disabling-of-data-checksums.patch (text/x-patch)
From 374aae5c819a75b4743849b91c5a7e4fdb224f17 Mon Sep 17 00:00:00 2001
From: Bernd Helmle <mailings@oopsware.de>
Date: Thu, 10 Jul 2025 17:25:57 +0200
Subject: [PATCH 1/4] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled during initdb or
when the cluster is offline using the pg_checksums application. This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func.sgml                        |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  554 ++++++-
 src/backend/access/transam/xlogfuncs.c        |   41 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   11 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1439 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |   13 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    3 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    9 +-
 src/backend/utils/init/postinit.c             |    7 +-
 src/backend/utils/misc/guc_tables.c           |   31 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   17 +
 src/include/commands/progress.h               |   16 +
 src/include/miscadmin.h                       |   10 +
 src/include/postmaster/datachecksumsworker.h  |   31 +
 src/include/storage/bufpage.h                 |   10 +
 src/include/storage/checksum.h                |    8 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/checksum/.gitignore                  |    2 +
 src/test/checksum/Makefile                    |   23 +
 src/test/checksum/README                      |   22 +
 src/test/checksum/meson.build                 |   15 +
 src/test/checksum/t/001_basic.pl              |   88 +
 src/test/checksum/t/002_restarts.pl           |   92 ++
 src/test/checksum/t/003_standby_restarts.pl   |  139 ++
 src/test/checksum/t/004_offline.pl            |  101 ++
 src/test/meson.build                          |    1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   36 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 58 files changed, 3262 insertions(+), 53 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/checksum/.gitignore
 create mode 100644 src/test/checksum/Makefile
 create mode 100644 src/test/checksum/README
 create mode 100644 src/test/checksum/meson.build
 create mode 100644 src/test/checksum/t/001_basic.pl
 create mode 100644 src/test/checksum/t/002_restarts.pl
 create mode 100644 src/test/checksum/t/003_standby_restarts.pl
 create mode 100644 src/test/checksum/t/004_offline.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c28aa71f570..b60514c6869 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -30053,6 +30053,77 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This switches the
+        data checksums mode to <literal>inprogress-on</literal> and starts a
+        background worker that will process all pages in the cluster and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   <sect2 id="functions-admin-dbobject">
    <title>Database Object Management Functions</title>
 
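As a quick usage sketch (not part of the patch itself), assuming a superuser
session; the data_checksums values shown follow the states introduced here:

    -- Start online enabling of data checksums (unthrottled).
    SELECT pg_enable_data_checksums();

    -- While the launcher/worker processes rewrite pages, the cluster
    -- reports the transitional state before switching to "on".
    SHOW data_checksums;    -- inprogress-on, later on

    -- Disabling passes through inprogress-off before reaching "off".
    SELECT pg_disable_data_checksums();
    SHOW data_checksums;    -- inprogress-off, then off
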
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index b88cac598e9..a4e16d03aae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -184,6 +184,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4265a22d4de..5464e236056 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3516,8 +3516,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3527,8 +3528,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6854,6 +6855,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker or launcher process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of the database being processed, or 0 for the launcher
+       process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
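As a usage sketch (not part of the patch), the progress of an online enable
or disable could be watched with a query such as the one below; the launcher
row is the one with datid 0 and a NULL datname:

    SELECT pid, datname, phase,
           databases_done, databases_total,
           relations_done, relations_total,
           blocks_done, blocks_total
      FROM pg_stat_progress_data_checksums
     ORDER BY datid;
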
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any progress made online.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster's data checksum mode into
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application, it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over, there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
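To illustrate the throttling and restart behaviour described above, a small
sketch (the cost values are arbitrary examples, throttling follows the
cost-based vacuum delay principle mentioned in the docs):

    -- Enable checksums with cost-based throttling to limit I/O impact.
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);

    -- If the cluster is restarted while data_checksums still reports
    -- inprogress-on, processing does not resume automatically; calling
    -- the function again starts the worker over from the beginning.
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
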
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 304b60933c9..4350f6abe3a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -561,6 +561,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -658,6 +661,24 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version. After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing. The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state. Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -726,6 +747,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(ChecksumType new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -839,9 +862,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -854,7 +878,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4353,6 +4379,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4692,10 +4724,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
 }
 
 /*
@@ -4729,13 +4757,410 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the checksums state to "inprogress-on" (which is performed by
+ * SetDataChecksumsOnInProgress()) and the second one to set the state to "on"
+ * (performed here).
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	SetLocalDataChecksumVersion(0);
+	return true;
+}
+
+/*
+ * InitLocalControlData
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalControldata(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+/*
+ * XXX probably should be called in all places that modify the value of
+ * LocalDataChecksumVersion (to make sure data_checksums GUC is in sync)
+ *
+ * XXX aren't PG_DATA_ and DATA_ constants the same? why do we need both?
+ */
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	switch (LocalDataChecksumVersion)
+	{
+		case PG_DATA_CHECKSUM_VERSION:
+			data_checksums = DATA_CHECKSUMS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_ON;
+			break;
+
+		case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+			data_checksums = DATA_CHECKSUMS_INPROGRESS_OFF;
+			break;
+
+		default:
+			data_checksums = DATA_CHECKSUMS_OFF;
+			break;
+	}
+}
+
+/*
+ * Initialize the various data checksum values - GUC, local, ....
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	uint32	data_checksum_version;
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	data_checksum_version = XLogCtl->data_checksum_version;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	SetLocalDataChecksumVersion(data_checksum_version);
+}
+
+/*
+ * Get the local data_checksum_version (cached XLogCtl value).
+ */
+uint32
+GetLocalDataChecksumVersion(void)
+{
+	return LocalDataChecksumVersion;
+}
+
+/*
+ * Get the *current* data_checksum_version (might not be written to control
+ * file yet).
+ */
+uint32
+GetCurrentDataChecksumVersion(void)
+{
+	return XLogCtl->data_checksum_version;
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -5179,6 +5604,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6317,6 +6747,33 @@ StartupXLOG(void)
 	 */
 	CompleteCommitTsInitialization();
 
+	/*
+	 * If we reach this point with checksums being enabled ("inprogress-on"
+	 * state), we notify the user that they need to manually restart the
+	 * process to enable checksums. This is because we cannot launch a dynamic
+	 * background worker directly from here, it has to be launched from a
+	 * regular backend.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		ereport(WARNING,
+				(errmsg("data checksums are being enabled, but no worker is running"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -7172,6 +7629,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from xlogctl, valid at
+	 * the time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7427,6 +7890,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7577,6 +8043,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7918,6 +8388,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8329,6 +8803,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(ChecksumType new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8757,6 +9249,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..6b1c2ed85ea 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,43 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(false, 0, 0, false);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(true, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
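A sketch of the argument validation added here, as seen from SQL (the error
texts are the ones raised by the ereport calls above):

    SELECT pg_enable_data_checksums(cost_delay => -1);
    -- ERROR:  cost delay cannot be a negative value

    SELECT pg_enable_data_checksums(cost_limit => 0);
    -- ERROR:  cost limit must be greater than zero
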
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index f0f88838dc2..13c181a3f8c 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..b18db2b5dde 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,13 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +782,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums() FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
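A sketch of the resulting access control: the REVOKE above removes the
default PUBLIC execute privilege, and even if EXECUTE is granted to another
role, the superuser() check in xlogfuncs.c still rejects the call
(maint_role is a hypothetical role used only for illustration):

    GRANT EXECUTE ON FUNCTION
      pg_enable_data_checksums(integer, integer, boolean) TO maint_role;
    SET ROLE maint_role;
    SELECT pg_enable_data_checksums();
    -- ERROR:  must be superuser
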
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b2d5332effc..9df7c24d535 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1351,6 +1351,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
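+-- Illustrative only: once processing has been started with
+-- pg_enable_data_checksums() (superuser only), progress can be followed
+-- with a query along these lines:
+--
+--   SELECT datname, phase, relations_done, relations_total,
+--          blocks_done, blocks_total
+--     FROM pg_stat_progress_data_checksums;
+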
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..5720f66b0fd 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..6a201dca8de
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1439 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or with pg_checksums on a
+ * shut-down cluster, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to process data checksum validation.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
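+ *
+ * As a minimal illustrative sketch of that pattern (verify_page() stands in
+ * for the actual checksum verification done at the call site):
+ *
+ *     HOLD_INTERRUPTS();
+ *     if (DataChecksumsNeedVerify())
+ *         verify_page(page, blkno);
+ *     RESUME_INTERRUPTS();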
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page already matches, we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry processing a database before giving up and
+ * considering it to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_enable/disable_data_checksums, to tell the launcher
+	 * what the target state is.
+	 */
+	bool		launch_enable_checksums;	/* True if checksums are being
+											 * enabled, else false */
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	bool		enabling_checksums; /* True if checksums are being enabled,
+									 * else false */
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+/* Bookkeeping for work to do */
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static bool enabling_checksums;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Request that data checksum processing is started
+ *
+ * Called from the SQL-callable functions to start data checksum processing,
+ * for enabling as well as disabling, by launching the launcher process unless
+ * it is already running.
+ */
+void
+StartDataChecksumsWorkerLauncher(bool enable_checksums,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		already_running;
+
+	/* the cost delay settings have no effect when disabling */
+	Assert(enable_checksums || cost_delay == 0);
+	Assert(enable_checksums || cost_limit == 0);
+
+	/* store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_enable_checksums = enable_checksums;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	already_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at latest
+	 * when it's about to exit, and will loop back and process the new request.
+	 * So if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!already_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					(errmsg("failed to start background worker to process data checksums")));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	BlockNumber blknum;
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/* XXX only do this for main forks, maybe we should do it for all? */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the master happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again. Iff wal_level is set to "minimal",
+		 * this could be avoided iff the checksum is calculated to be correct.
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(enabling_checksums);
+		if (!DataChecksumsWorkerShmem->launch_enable_checksums)
+			abort_requested = true;
+		if (abort_requested)
+			return false;
+
+		/* XXX only do this for main forks, maybe we should do it for all? */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	ForkNumber	fnum;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
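+
+	/* Ensure rel->rd_smgr is populated before probing the forks below */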
+	RelationGetSmgr(rel);
+
+	for (fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				(errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+						db->dbname),
+				 errhint("The max_worker_processes setting might be too low.")));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				(errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+						db->dbname),
+				 errhint("More details on the error might be found in the server log.")));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster died we cannot finish processing this database, so we
+	 * have no alternative other than exiting. When enabling checksums we
+	 * won't at this time have changed the pg_control version to enabled so
+	 * when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will be set to off before this
+	 * so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("cannot enable data checksums without the postmaster process"),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			(errmsg("initiating data checksum processing in database \"%s\"",
+					db->dbname)));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				(errmsg("postmaster exited during data checksum processing in \"%s\"",
+						db->dbname),
+				 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				(errmsg("data checksums processing was aborted in database \"%s\"",
+						db->dbname)));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clean up the abort flag to ensure that processing can be restarted
+ * again after it was previously aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all currently running transactions have finished
+ *
+ * Returns when all transactions which are active at the call of the function
+ * have ended, or if the postmaster dies while waiting. If the postmaster dies
+ * the abort flag will be set to indicate that the caller of this shouldn't
+ * proceed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   5000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					(errmsg("postmaster exited during data checksum processing"),
+					 errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums().")));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->enabling_checksums = enabling_checksums;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * If we're asked to enable checksums, first check whether there is
+	 * anything to do at all.
+	 */
+	if (enabling_checksums)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					(errmsg("unable to enable data checksums in cluster")));
+		}
+
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums)
+	{
+		DataChecksumsWorkerShmem->enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		enabling_checksums = DataChecksumsWorkerShmem->launch_enable_checksums;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_IMMEDIATE will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so first run processes shared catalogs, but not once in every
+	 * db.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process. This number should not be changed during processing; the
+	 * column for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to check for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be
+		 * certain that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * they actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in failed databases which still exist.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					(errmsg("failed to enable data checksums in \"%s\"",
+							db->dbname)));
+			found_failed = true;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		ereport(ERROR,
+				(errmsg("data checksums failed to get enabled in all databases, aborting"),
+				 errhint("The server log might have more information on the cause of the error.")));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, which could otherwise not reliably be
+	 * placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_IMMEDIATE;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if this is a redundant assignment, we want to be explicit
+		 * about our intent for readability, since we want to be able to query
+		 * this state in case of restartability.
+		 */
+		DataChecksumsWorkerShmem->launch_enable_checksums = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage, and thus data
+ * checksums, are returned. If include_shared is true then shared relations
+ * are included as well in the non-temporary list; include_shared has no
+ * relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
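+	/*
+	 * A per-database worker is only ever launched when enabling data
+	 * checksums; disabling is handled entirely by the launcher via the
+	 * control file.
+	 */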
+	enabling_checksums = true;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present as we start in this database. We
+	 * need to wait until they are all gone before we are done, since we cannot
+	 * access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->enabling_checksums);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				(errmsg("data checksum processing aborted in database OID %u",
+						dboid)));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 5 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 5000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_FINISHCONDITION);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_enable_checksums != enabling_checksums;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					(errmsg("data checksum processing aborted in database OID %u",
+							dboid)));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index bf6b55ee830..4a86d2588ff 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -204,6 +204,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
@@ -284,6 +287,16 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 			memcpy(MyClientSocket, client_sock, sizeof(ClientSocket));
 		}
 
+		/*
+		 * update the LocalProcessControlFile to match XLogCtl->data_checksum_version
+		 *
+		 * XXX It seems the postmaster (which is what gets forked into the new
+		 * child process) does not absorb the checksum barriers, therefore it
+		 * does not update the value (except after a restart). Not sure if there
+		 * is some sort of race condition.
+		 */
+		InitLocalDataChecksumVersion();
+
 		/*
 		 * Run the appropriate Main function
 		 */
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 490f7ce3664..bfe17988e44 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2973,6 +2973,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2fa045e6b0f..44213d140ae 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index a9bb540b55a..c48abee2b60 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -576,6 +577,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
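+
+As an illustrative example (parameter names as declared for the function; both
+throttling parameters are optional):
+
+    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
+    SELECT pg_disable_data_checksums();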
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index dbb49ed9197..19cf6512e52 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 51256277e8d..dfef4a39bb0 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -388,6 +388,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index d8d26379a57..faa32dd9f83 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -370,6 +370,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 4da68312b5f..8cbf879a1c7 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -116,6 +116,8 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -352,6 +354,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
+DataChecksumsWorker	"Waiting to read or update datachecksumsworker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 1c12ddbae49..9b867d25909 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 43b4dbccc3d..e39498d1250 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -296,6 +296,12 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_IO_WORKER:
 			backendDesc = gettext_noop("io worker");
 			break;
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -895,7 +901,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index c86ceefda94..c5202a8a8db 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -749,6 +749,11 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	/*
+	 * Set up backend local cache of Controldata values.
+	 */
+	InitLocalControldata();
+
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
 	/*
@@ -879,7 +884,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..8539dab0e48 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", DATA_CHECKSUMS_ON, true},
+	{"off", DATA_CHECKSUMS_OFF, true},
+	{"inprogress-on", DATA_CHECKSUMS_INPROGRESS_ON, true},
+	{"inprogress-off", DATA_CHECKSUMS_INPROGRESS_OFF, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -616,7 +624,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1968,17 +1975,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5418,6 +5414,17 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, assign_io_method, NULL
 	},
 
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		DATA_CHECKSUMS_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f20be82862a..463384d323b 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == DATA_CHECKSUMS_ON &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d313099c027..aec3ea0bc63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +231,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalControldata(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 2cf8d55d706..ee82d4ce73d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	ChecksumType new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..727042b73be 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;		/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 1fc19146f46..945d50453be 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12352,6 +12352,23 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => '',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 7c736e7b03b..b172a5f24ce 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -157,4 +157,20 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..c86294b4d19 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -366,6 +366,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -391,6 +394,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
@@ -536,6 +542,10 @@ extern Size EstimateClientConnectionInfoSpace(void);
 extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
 extern void RestoreClientConnectionInfo(char *conninfo);
 
+extern uint32 GetLocalDataChecksumVersion(void);
+extern uint32 GetCurrentDataChecksumVersion(void);
+extern void InitLocalDataChecksumVersion(void);
+
 /* in executor/nodeHash.c */
 extern size_t get_hash_memory_limit(void);
 
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..59c9000d646
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(bool enable_checksums,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index aeb67c498c5..20631844bac 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -205,7 +205,17 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
+
+/*
+ * Checksum version 0 means that data checksums are disabled.
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION mean that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
 #define PG_DATA_CHECKSUM_VERSION	1
+#define PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION	2
+#define PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION	3
+
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..6faff962ef0 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,14 @@
 
 #include "storage/block.h"
 
+typedef enum ChecksumType
+{
+	DATA_CHECKSUMS_OFF = 0,
+	DATA_CHECKSUMS_ON,
+	DATA_CHECKSUMS_INPROGRESS_ON,
+	DATA_CHECKSUMS_INPROGRESS_OFF
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index a9681738146..ab3751636ac 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -84,3 +84,4 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
+PG_LWLOCK(54, DataChecksumsWorker)
\ No newline at end of file
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 9f9b3fcfbf1..e2c1e178bc8 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -455,11 +455,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The DataChecksums worker and launcher
+ * can consume 2 slots when data checksums are enabled or disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/checksum/.gitignore b/src/test/checksum/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/checksum/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
new file mode 100644
index 00000000000..fd03bf73df4
--- /dev/null
+++ b/src/test/checksum/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/checksum
+#
+# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/checksum/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/checksum
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/checksum/README b/src/test/checksum/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/checksum/README
@@ -0,0 +1,22 @@
+src/test/checksum/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: In the case of "check" this creates a temporary installation with
+multiple nodes, a primary as well as standby(s), for the purpose of
+the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/checksum/meson.build b/src/test/checksum/meson.build
new file mode 100644
index 00000000000..5f96b5c246d
--- /dev/null
+++ b/src/test/checksum/meson.build
@@ -0,0 +1,15 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+    ],
+  },
+}
diff --git a/src/test/checksum/t/001_basic.pl b/src/test/checksum/t/001_basic.pl
new file mode 100644
index 00000000000..4c64f6a14fc
--- /dev/null
+++ b/src/test/checksum/t/001_basic.pl
@@ -0,0 +1,88 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable data checksums
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Wait for checksums to become enabled
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op..
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again
+$node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are disabled');
+
+# Test reading again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+
+done_testing();
diff --git a/src/test/checksum/t/002_restarts.pl b/src/test/checksum/t/002_restarts.pl
new file mode 100644
index 00000000000..2697b722257
--- /dev/null
+++ b/src/test/checksum/t/002_restarts.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# restarting the processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this by setting up a background psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that enabling does not complete while it exists.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, '1', "ensure checksums aren't enabled yet");
+
+$bsession->quit;
+$node->stop;
+$node->start;
+
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'inprogress-on', "ensure checksums aren't enabled yet");
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are turned on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+$result = $node->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are turned off');
+
+done_testing();
diff --git a/src/test/checksum/t/003_standby_restarts.pl b/src/test/checksum/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..6782664f4e6
--- /dev/null
+++ b/src/test/checksum/t/003_standby_restarts.pl
@@ -0,0 +1,139 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on primary');
+
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, "off", 'ensure checksums are turned off on standby_1');
+
+# Enable checksums for the cluster
+$node_primary->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+# Ensure that the primary switches to "inprogress-on"
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	"inprogress-on");
+is($result, 1, 'ensure checksums are in progress on primary');
+
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the primary');
+
+# Wait for checksums enabled on the standby
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'on');
+is($result, 1, 'ensure checksums are enabled on the standby');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, '20000', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+$node_primary->safe_psql('postgres', "SELECT pg_disable_data_checksums();");
+# Wait for checksum disable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure data checksums are disabled on the primary');
+
+# Ensure that the standby has switched to off
+$result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'off');
+is($result, 1, 'ensure checksums are off on standby_1');
+
+$result = $node_primary->safe_psql('postgres', "SELECT count(a) FROM t");
+is($result, "20000", 'ensure we can safely read all data without checksums');
+
+done_testing();
diff --git a/src/test/checksum/t/004_offline.pl b/src/test/checksum/t/004_offline.pl
new file mode 100644
index 00000000000..9cee62c9b52
--- /dev/null
+++ b/src/test/checksum/t/004_offline.pl
@@ -0,0 +1,101 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# pg_enable_data_checksums takes three parameters: cost_delay, cost_limit and
+# fast. For testing we always want to override the default value for 'fast'
+# with true, which causes immediate checkpoints. 0 and 100 are the defaults
+# for cost_delay and cost_limit, which are fine for testing, so let's keep
+# them.
+my $enable_params = '0, 100, true';
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+my $result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'off', 'ensure checksums are disabled');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We
+# accomplish this by setting up a background psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then start
+# processing anyway and check that enabling does not complete while it exists.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+$node->safe_psql('postgres',
+	"SELECT pg_enable_data_checksums($enable_params);");
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'inprogress-on');
+is($result, 1, 'ensure checksums are in the process of being enabled');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+$result = $node->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+is($result, 'on', 'ensure checksums are enabled');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+done_testing();
diff --git a/src/test/meson.build b/src/test/meson.build
index ccc31d6a86a..c4bfef2c0e2 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -8,6 +8,7 @@ subdir('postmaster')
 subdir('recovery')
 subdir('subscription')
 subdir('modules')
+subdir('checksum')
 
 if ssl.found()
   subdir('ssl')
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 301766d2ed9..567af4b9624 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3845,6 +3845,42 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index dce8c672b40..832e6bb6866 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2066,6 +2066,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 776f1ad0e53..3634a4cde0b 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 io worker|relation|bulkread
 io worker|relation|bulkwrite
 io worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(95 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 83192038571..0b2e0deabc1 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -419,6 +419,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -611,6 +612,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4239,6 +4244,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.50.0
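
For quick reference, a minimal psql sketch of the SQL-level interface added by
0001.  The function arguments (cost_delay, cost_limit, fast) and the possible
data_checksums states come from the patch itself; the concrete values shown
are simply the ones used by the TAP tests, so treat this as illustration
rather than recommended settings:

    -- Start enabling data checksums in the background
    SELECT pg_enable_data_checksums(0, 100, true);

    -- Reports one of: off, inprogress-on, on, inprogress-off
    SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';

    -- Turn data checksums off again
    SELECT pg_disable_data_checksums();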

Attachment: v20250711-0002-Reviewfixups.patch (text/x-patch)
From 7676617ec45f8aa08c6f8ef7c5fc889ade4a9994 Mon Sep 17 00:00:00 2001
From: Bernd Helmle <Bernd Helmle mailings@oopsware.de>
Date: Thu, 10 Jul 2025 18:23:51 +0200
Subject: [PATCH 2/4]  Reviewfixups

---
 src/backend/access/transam/xlog.c            | 26 +++++++++++++++++++-
 src/backend/postmaster/datachecksumsworker.c |  2 +-
 src/backend/utils/init/postinit.c            |  4 +--
 src/include/postmaster/datachecksumsworker.h |  2 +-
 src/test/checksum/Makefile                   |  2 +-
 5 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 4350f6abe3a..53890c3e726 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -671,6 +671,16 @@ static bool updateMinRecoveryPoint = true;
  */
 static uint32 LocalDataChecksumVersion = 0;
 
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for enabling
+ * checksums is the first one or not. The first procsignalbarrier can in rare
+ * circumstances cause a transition from 'on' to 'on' when a new process is
+ * spawned between the update of XLogCtl->data_checksum_version and the
+ * barrier being emitted.  This can only happen on the very first barrier so
+ * mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
 /*
  * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
  * See SetLocalDataChecksumVersion().
@@ -5049,7 +5059,20 @@ AbsorbChecksumsOnInProgressBarrier(void)
 bool
 AbsorbChecksumsOnBarrier(void)
 {
-	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	/*
+	 * If the process was spawned between updating XLogCtl and emitting the
+	 * barrier it will have seen the updated value, so for the first barrier
+	 * we accept both "on" and "inprogress-on".
+	 */
+	if (InitialDataChecksumTransition)
+	{
+		Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+			   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION));
+		InitialDataChecksumTransition = false;
+	}
+	else
+		Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
 	return true;
 }
@@ -5435,6 +5458,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index 6a201dca8de..81be2808895 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -150,7 +150,7 @@
  *     online operation).
  *
  *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index c5202a8a8db..785b8d4b04f 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -749,13 +749,13 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	 */
 	SharedInvalBackendInit(false);
 
+	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
+
 	/*
 	 * Set up backend local cache of Controldata values.
 	 */
 	InitLocalControldata();
 
-	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
-
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
index 59c9000d646..0649232723d 100644
--- a/src/include/postmaster/datachecksumsworker.h
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -4,7 +4,7 @@
  *	  header file for checksum helper background worker
  *
  *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * src/include/postmaster/datachecksumsworker.h
diff --git a/src/test/checksum/Makefile b/src/test/checksum/Makefile
index fd03bf73df4..f287001301e 100644
--- a/src/test/checksum/Makefile
+++ b/src/test/checksum/Makefile
@@ -2,7 +2,7 @@
 #
 # Makefile for src/test/checksum
 #
-# Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
 # Portions Copyright (c) 1994, Regents of the University of California
 #
 # src/test/checksum/Makefile
-- 
2.50.0
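
While processing is running, the state can be observed from any session.  The
sketch below is modelled on the queries in the TAP tests and on the
pg_stat_progress_data_checksums view added by 0001; exact output will of
course vary:

    -- Launcher and per-database worker backends, visible while processing runs
    SELECT pid, backend_type, wait_event_type, wait_event
      FROM pg_stat_activity
     WHERE backend_type LIKE 'datachecksumsworker%';

    -- Progress reporting for the current run
    SELECT datname, phase, relations_done, relations_total,
           blocks_done, blocks_total
      FROM pg_stat_progress_data_checksums;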

Attachment: v20250711-0003-Rework-handling-of-procsignalbarrier-and-local-cache.patch (text/x-patch)
From f14cf88958858c637f18c11c4221dfed0a04ccba Mon Sep 17 00:00:00 2001
From: Bernd Helmle <Bernd Helmle mailings@oopsware.de>
Date: Thu, 10 Jul 2025 18:35:51 +0200
Subject: [PATCH 3/4] Rework handling of procsignalbarrier and local cache for
 data_checksum_version.

---
 src/backend/access/transam/xlog.c       | 45 +++++++++----------------
 src/backend/postmaster/auxprocess.c     | 19 +++++++++++
 src/backend/postmaster/launch_backend.c | 10 ------
 src/backend/utils/init/postinit.c       | 17 ++++++++--
 src/include/access/xlog.h               |  2 +-
 src/include/miscadmin.h                 |  1 -
 6 files changed, 51 insertions(+), 43 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 53890c3e726..097db70fcfc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -672,12 +672,15 @@ static bool updateMinRecoveryPoint = true;
 static uint32 LocalDataChecksumVersion = 0;
 
 /*
- * Flag to remember if the procsignalbarrier being absorbed for enabling
- * checksums is the first one or not. The first procsignalbarrier can in rare
- * circumstances cause a transition from 'on' to 'on' when a new process is
+ * Flag to remember if the procsignalbarrier being absorbed for checksums
+ * is the first one. The first procsignalbarrier can in rare cases be for
+ * the state we've initialized, i.e. a duplicate. This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_VERSION this
+ * would trigger an assert failure (this is the only transition with an
+ * assert) when processing the barrier. This may happen if the process is
  * spawned between the update of XLogCtl->data_checksum_version and the
- * barrier being emitted.  This can only happen on the very first barrier so
- * mark that with this flag.
+ * barrier being emitted. This can only happen on the very first barrier
+ * so mark that with this flag.
  */
 static bool InitialDataChecksumTransition = true;
 
@@ -5052,6 +5055,7 @@ SetDataChecksumsOff(void)
 bool
 AbsorbChecksumsOnInProgressBarrier(void)
 {
+	/* XXX can't we check we're in OFF or INPROGRESS_ON? */
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
 	return true;
 }
@@ -5064,22 +5068,19 @@ AbsorbChecksumsOnBarrier(void)
 	 * barrier it will have seen the updated value, so for the first barrier
 	 * we accept both "on" and "inprogress-on".
 	 */
-	if (InitialDataChecksumTransition)
-	{
-		Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
-			   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION));
-		InitialDataChecksumTransition = false;
-	}
-	else
-		Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+		   (InitialDataChecksumTransition &&
+			(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)));
 
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	InitialDataChecksumTransition = false;
 	return true;
 }
 
 bool
 AbsorbChecksumsOffInProgressBarrier(void)
 {
+	/* XXX can't we check we're in ON or INPROGRESS_OFF? */
 	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
 	return true;
 }
@@ -5087,6 +5088,7 @@ AbsorbChecksumsOffInProgressBarrier(void)
 bool
 AbsorbChecksumsOffBarrier(void)
 {
+	/* XXX can't we check we're in INPROGRESS_OFF? */
 	SetLocalDataChecksumVersion(0);
 	return true;
 }
@@ -5100,7 +5102,7 @@ AbsorbChecksumsOffBarrier(void)
  * purpose enough to handle future cases.
  */
 void
-InitLocalControldata(void)
+InitLocalDataChecksumVersion(void)
 {
 	SpinLockAcquire(&XLogCtl->info_lck);
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
@@ -5138,21 +5140,6 @@ SetLocalDataChecksumVersion(uint32 data_checksum_version)
 	}
 }
 
-/*
- * Initialize the various data checksum values - GUC, local, ....
- */
-void
-InitLocalDataChecksumVersion(void)
-{
-	uint32	data_checksum_version;
-
-	SpinLockAcquire(&XLogCtl->info_lck);
-	data_checksum_version = XLogCtl->data_checksum_version;
-	SpinLockRelease(&XLogCtl->info_lck);
-
-	SetLocalDataChecksumVersion(data_checksum_version);
-}
-
 /*
  * Get the local data_checksum_version (cached XLogCtl value).
  */
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..4b56ef0eb81 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated
+	 * by the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 4a86d2588ff..955df32be5d 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -287,16 +287,6 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 			memcpy(MyClientSocket, client_sock, sizeof(ClientSocket));
 		}
 
-		/*
-		 * update the LocalProcessControlFile to match XLogCtl->data_checksum_version
-		 *
-		 * XXX It seems the postmaster (which is what gets forked into the new
-		 * child process) does not absorb the checksum barriers, therefore it
-		 * does not update the value (except after a restart). Not sure if there
-		 * is some sort of race condition.
-		 */
-		InitLocalDataChecksumVersion();
-
 		/*
 		 * Run the appropriate Main function
 		 */
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 785b8d4b04f..7418deb10f5 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -752,9 +752,22 @@ InitPostgres(const char *in_dbname, Oid dboid,
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
 	/*
-	 * Set up backend local cache of Controldata values.
+	 * Initialize a local cache of the data_checksum_version, to be updated
+	 * by the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
 	 */
-	InitLocalControldata();
+	InitLocalDataChecksumVersion();
 
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index aec3ea0bc63..615b2cf4ec8 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -243,7 +243,7 @@ extern bool AbsorbChecksumsOffInProgressBarrier(void);
 extern bool AbsorbChecksumsOnBarrier(void);
 extern bool AbsorbChecksumsOffBarrier(void);
 extern const char *show_data_checksums(void);
-extern void InitLocalControldata(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index c86294b4d19..5ce7a1cbc65 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -544,7 +544,6 @@ extern void RestoreClientConnectionInfo(char *conninfo);
 
 extern uint32 GetLocalDataChecksumVersion(void);
 extern uint32 GetCurrentDataChecksumVersion(void);
-extern void InitLocalDataChecksumVersion(void);
 
 /* in executor/nodeHash.c */
 extern size_t get_hash_memory_limit(void);
-- 
2.50.0

#46Daniel Gustafsson
daniel@yesql.se
In reply to: Bernd Helmle (#45)
Re: Changing the state of data checksums in a running cluster

On 11 Jul 2025, at 17:53, Bernd Helmle <mailings@oopsware.de> wrote:

Since I wanted to dig a little deeper into this patch I took the
opportunity and rebased it to current master, hopefully not having
broken anything seriously.

Thanks, much appreciated!

--
Daniel Gustafsson

#47Daniel Gustafsson
daniel@yesql.se
In reply to: Bernd Helmle (#45)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

Attached is a rebase on top of the func.sgml changes which caused this to no
longer apply.

This version is also substantially updated: it adds a new injection point based
test suite, fixes a few bugs (found by said test suite), adds a checkpoint when
disabling checksums, and includes code cleanup, more granular wait events,
comment rewrites and additions, and further smaller cleanups.
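
For anyone wanting to kick the tires, here is a rough sketch of the SQL-level
flow (function defaults, GUC and progress view names are taken from this
version of the patch, so treat it as illustrative rather than final):

  -- start enabling; throttling follows the cost-based vacuum delay model
  SELECT pg_enable_data_checksums(cost_delay => 0, cost_limit => 100);

  -- the launcher and the per-database workers report progress here
  SELECT pid, datname, phase, relations_done, relations_total
    FROM pg_stat_progress_data_checksums;

  -- the GUC reflects the cluster-wide state:
  -- off -> inprogress-on -> on, and on -> inprogress-off -> off
  SHOW data_checksums;

  -- disabling goes through inprogress-off before ending up at off
  SELECT pg_disable_data_checksums();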

--
Daniel Gustafsson

Attachments:

v20250816-0001-Online-enabling-and-disabling-of-data-chec.patchapplication/octet-stream; name=v20250816-0001-Online-enabling-and-disabling-of-data-chec.patch; x-unix-mode=0644Download
From 298a9390b81749fef66bcba74ff93a4fadca5b66 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 15 Aug 2025 14:48:02 +0200
Subject: [PATCH v20250816] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  Tomas
Vondra has given invaluable assistance with not only review but
very in-depth testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func/func-admin.sgml             |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/regress.sgml                     |   11 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  547 +++++-
 src/backend/access/transam/xlogfuncs.c        |   43 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   16 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/auxprocess.c           |   19 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1463 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |    3 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    4 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |   12 +-
 src/backend/utils/init/postinit.c             |   20 +-
 src/backend/utils/misc/guc_tables.c           |   30 +-
 src/bin/pg_checksums/pg_checksums.c           |    2 +-
 src/bin/pg_controldata/pg_controldata.c       |    2 +
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   19 +
 src/include/commands/progress.h               |   16 +
 src/include/miscadmin.h                       |    6 +
 src/include/postmaster/datachecksumsworker.h  |   51 +
 src/include/storage/bufpage.h                 |    2 +-
 src/include/storage/checksum.h                |   14 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/modules/meson.build                  |    1 +
 src/test/modules/test_checksums/.gitignore    |    2 +
 src/test/modules/test_checksums/Makefile      |   36 +
 src/test/modules/test_checksums/README        |   22 +
 src/test/modules/test_checksums/meson.build   |   34 +
 .../modules/test_checksums/t/001_basic.pl     |   63 +
 .../modules/test_checksums/t/002_restarts.pl  |  110 ++
 .../test_checksums/t/003_standby_restarts.pl  |  114 ++
 .../modules/test_checksums/t/004_offline.pl   |   82 +
 .../modules/test_checksums/t/005_injection.pl |   76 +
 .../test_checksums/t/DataChecksums/Utils.pm   |  185 +++
 .../test_checksums/test_checksums--1.0.sql    |   20 +
 .../modules/test_checksums/test_checksums.c   |  173 ++
 .../test_checksums/test_checksums.control     |    4 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   36 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 66 files changed, 3777 insertions(+), 55 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/modules/test_checksums/.gitignore
 create mode 100644 src/test/modules/test_checksums/Makefile
 create mode 100644 src/test/modules/test_checksums/README
 create mode 100644 src/test/modules/test_checksums/meson.build
 create mode 100644 src/test/modules/test_checksums/t/001_basic.pl
 create mode 100644 src/test/modules/test_checksums/t/002_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/003_standby_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/004_offline.pl
 create mode 100644 src/test/modules/test_checksums/t/005_injection.pl
 create mode 100644 src/test/modules/test_checksums/t/DataChecksums/Utils.pm
 create mode 100644 src/test/modules/test_checksums/test_checksums--1.0.sql
 create mode 100644 src/test/modules/test_checksums/test_checksums.c
 create mode 100644 src/test/modules/test_checksums/test_checksums.control

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 446fdfe56f4..69afb61ceb8 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -2959,4 +2959,75 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in the cluster
+        and enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch the data checksums mode
+        to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index b88cac598e9..a4e16d03aae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -184,6 +184,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3f4a27a736e..6082d991497 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3527,8 +3527,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3538,8 +3539,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6877,6 +6878,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker or launcher process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   If checksums were in the process of being enabled online when the cluster
+   was shut down, <application>pg_checksums</application> will still process
+   all relations, regardless of any progress made by the online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8838fe7f022..efa3f5b0bcc 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -263,6 +263,17 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
    The following values are currently supported:
    <variablelist>
+    <varlistentry>
+     <term><literal>checksum_delays</literal></term>
+     <listitem>
+      <para>
+       Runs additional tests for enabling data checksums which inject delays
+       and retries into the processing.  Some of these test suites require
+       injection points to be enabled in the installation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><literal>kerberos</literal></term>
      <listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster, allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster in <literal>inprogress-on</literal>
+    mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two additional
+    processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e8909406686..914359c8a65 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -561,6 +561,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -658,6 +661,36 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version.  After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing.  The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state.  Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_VERSION this would
+ * trigger an assert failure (this is the only transition with an assert) when
+ * processing the barrier.  This may happen if the process is spawned between
+ * the update of XLogCtl->data_checksum_version and the barrier being emitted.
+ * This can only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -726,6 +759,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(uint32 new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -839,9 +874,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -854,7 +890,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4362,6 +4400,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4685,10 +4729,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
 }
 
 /*
@@ -4722,13 +4762,374 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header.  This function should be called as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  This function should
+ * be called as close to the validation as possible to keep the critical
+ * section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used while the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	/*
+	 * If the process was spawned between updating XLogCtl and emitting the
+	 * barrier it will have seen the updated value, so for the first barrier
+	 * we accept both "on" and "inprogress-on".
+	 */
+	Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+		   (InitialDataChecksumTransition &&
+			(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)));
+
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	InitialDataChecksumTransition = false;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	/*
+	 * We should never get here directly from a cluster with data checksums
+	 * enabled; an inprogress state should be in between.  When there are no
+	 * failures the inprogress-off state should precede, but in case of error
+	 * in processing we can also reach here from the inprogress-on state.
+	 */
+	Assert((LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION) &&
+		   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION));
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_OFF);
+	return true;
+}
+
+/*
+ * InitLocalDataChecksumVersion
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	data_checksums = data_checksum_version;
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -5003,6 +5404,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -5172,6 +5574,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* Use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6320,6 +6727,47 @@ StartupXLOG(void)
 	pfree(endOfRecoveryInfo->recoveryStopReason);
 	pfree(endOfRecoveryInfo);
 
+	/*
+	 * If we reach this point with checksums in the state inprogress-on, it
+	 * means that data checksums were in the process of being enabled when the
+	 * cluster shut down. Since processing didn't finish, the operation will
+	 * have to be restarted from scratch since there is no capability to
+	 * continue where it was when the cluster shut down. Thus, revert the
+	 * state back to off, and inform the user with a warning message. Being
+	 * able to restart processing is a TODO, but it wouldn't be possible to
+	 * restart here since we cannot launch a dynamic background worker
+	 * directly from here (it has to be from a regular backend).
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		ereport(WARNING,
+				(errmsg("data checksums state has been set to off"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+	}
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -6611,7 +7059,7 @@ GetRedoRecPtr(void)
 	XLogRecPtr	ptr;
 
 	/*
-	 * The possibly not up-to-date copy in XlogCtl is enough. Even if we
+	 * The possibly not up-to-date copy in XLogCtl is enough. Even if we
 	 * grabbed a WAL insertion lock to read the authoritative value in
 	 * Insert->RedoRecPtr, someone might update it just after we've released
 	 * the lock.
@@ -7175,6 +7623,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from XLogCtl, valid at the
+	 * time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7430,6 +7884,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7575,6 +8032,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7916,6 +8377,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8327,6 +8792,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8745,6 +9228,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..337932a89e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,45 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	bool		fast = PG_GETARG_BOOL(0);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index bb7d90aa5d9..54dcfbcb333 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..dea7ad3cf30 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,18 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
+CREATE OR REPLACE FUNCTION
+  pg_disable_data_checksums(fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'disable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +787,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums(boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1b3c5a55882..22f67c7ee4a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1354,6 +1354,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..5742a1dd724 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 1ad65c237c3..0d2ade1f905 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..ff451d502ba
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1463 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or with pg_checksums while
+ * the cluster is shut down, no extra process is required as each page is
+ * checksummed, and verified, when accessed.  When enabling checksums on an
+ * already running cluster, this worker ensures that all pages are checksummed
+ * before verification of the checksums is turned on.  In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This
+ * ensures that backends which have yet to move from the "on" state will
+ * still be able to validate data checksums.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background workers cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: Even if the
+ *     checksum on the page happens to already match we still dirty the page.
+ *     It should be enough to only do the log_newpage_buffer() call in that
+ *     case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_{enable|disable}_data_checksums() to tell the
+	 * launcher what the target state is.
+	 */
+	DataChecksumsWorkerOperation launch_operation;
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	DataChecksumsWorkerOperation operation;
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static DataChecksumsWorkerOperation operation;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Launch the datachecksumsworker launcher process
+ *
+ * The entry point for starting data checksum processing, for enabling as
+ * well as disabling.  Registers a dynamic background worker acting as the
+ * launcher, unless one is already running.
+ */
+void
+StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+#ifdef USE_ASSERT_CHECKING
+	/* The cost delay settings have no effect when disabling */
+	if (op == DISABLE_DATACHECKSUMS)
+		Assert(cost_delay == 0 && cost_limit == 0);
+#endif
+
+	/* Store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_operation = op;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					errmsg("failed to start background worker to process data checksums"));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * As of now we only update the block counter for main forks in order to
+	 * avoid overly frequent calls. TODO: investigate whether we should do it
+	 * more frequently.
+	 */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  TODO: investigate if this could be
+		 * avoided if the checksum is calculated to be correct and wal_level
+		 * is set to "minimal".
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(operation == ENABLE_DATACHECKSUMS);
+		if (DataChecksumsWorkerShmem->launch_operation == DISABLE_DATACHECKSUMS)
+			abort_requested = true;
+
+		if (abort_requested)
+			return false;
+
+		/*
+		 * As of now we only update the block counter for main forks in order
+		 * to avoid overly frequent calls. TODO: investigate whether we
+		 * should do it more frequently.
+		 */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
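+	/* Make sure rel->rd_smgr is populated for the smgrexists() calls below */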
+	RelationGetSmgr(rel);
+
+	for (ForkNumber fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+					   db->dbname),
+				errhint("The max_worker_processes setting might be too low."));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+					   db->dbname),
+				errhint("More details on the error might be found in the server log."));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting.  When enabling checksums
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("cannot enable data checksums without the postmaster process"),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			errmsg("initiating data checksum processing in database \"%s\"",
+				   db->dbname));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("postmaster exited during data checksum processing in \"%s\"",
+					   db->dbname),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				errmsg("data checksum processing was aborted in database \"%s\"",
+					   db->dbname));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Block until all currently running transactions have finished
+ *
+ * Returns when all transactions which were active at the time of the call
+ * have ended.  If the postmaster dies while waiting, the process exits with
+ * FATAL since processing cannot be completed.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
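+	/*
+	 * Transactions starting after this point will be assigned an xid greater
+	 * than or equal to waitforxid, so we only need to wait for those which
+	 * were already running.
+	 */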
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	operation = DataChecksumsWorkerShmem->launch_operation;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->operation = operation;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		/*
+		 * All backends are now in inprogress-on state and are writing data
+		 * checksums.  Start processing all data at rest.
+		 */
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_operation != operation)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					errmsg("unable to enable data checksums in cluster"));
+		}
+
+		/*
+		 * Data checksums have been set on all pages, set the state to on in
+		 * order to instruct backends to validate checksums on reading.
+		 */
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		int			flags;
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+
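+		/*
+		 * Force a checkpoint to make everything consistent after the state
+		 * change.
+		 */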
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (DataChecksumsWorkerShmem->immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_operation != operation)
+	{
+		DataChecksumsWorkerShmem->operation = DataChecksumsWorkerShmem->launch_operation;
+		operation = DataChecksumsWorkerShmem->launch_operation;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This repeatedly generates a list of databases to process for enabling
+ * checksums, comparing each new list against the databases already processed,
+ * and loops until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_FAST will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also takes care of the
+	 * shared catalogs; they do not need to be reprocessed for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/* Allow a test case to modify the initial list of databases */
+	INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number should not be changed during processing, the
+	 * columns for processed databases is instead increased such that it can
+	 * be compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			/* Allow a test process to alter the result of the operation */
+			INJECTION_POINT("datachecksumsworker-fail-db", &result);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to check for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in the processed databases which failed, and
+		 * where the failed database still exists.  This indicates that
+		 * enabling checksums actually failed, and not that the failure was
+		 * due to the db being concurrently dropped.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					errmsg("failed to enable data checksums in \"%s\"", db->dbname));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		/* Force a checkpoint to make everything consistent */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+		ereport(ERROR,
+				errmsg("failed to enable data checksums in all databases, aborting"),
+				errhint("The server log might have more information on the cause of the error."));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk.  Immediate
+	 * checkpoints are mainly intended for tests, which otherwise cannot
+	 * reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * These assignments are redundant after the MemSet above, but we want
+		 * to be explicit about the initial state for readability.
+		 */
+		DataChecksumsWorkerShmem->launch_operation = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
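+	/*
+	 * Allocate the list and its members in the caller's memory context, not
+	 * in the transaction context created below, so that the list survives
+	 * the CommitTransactionCommand() at the end of this function.
+	 */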
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list. include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
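+	/*
+	 * The per-database worker is only ever launched for enabling data
+	 * checksums; disabling is handled by the launcher alone and does not
+	 * touch data pages.
+	 */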
+	operation = ENABLE_DATACHECKSUMS;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->operation == ENABLE_DATACHECKSUMS);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* Process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				errmsg("data checksum processing aborted in database OID %u",
+					   dboid));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		INJECTION_POINT("datachecksumsworker-fake-temptable-wait", &numleft);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 3000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_TEMPTABLE_WAIT);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_operation != operation;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					errmsg("data checksum processing aborted in database OID %u",
+						   dboid));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index bf6b55ee830..955df32be5d 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -204,6 +204,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e01d9f0cfe8..f8303992ca5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2980,6 +2980,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2fa045e6b0f..44213d140ae 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..6881c6f4069 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -576,6 +577,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index dbb49ed9197..19cf6512e52 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 8714a85e2d9..edc2512d79f 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -378,6 +378,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 13ae57ed649..a290d56f409 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -362,6 +362,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 0be307d2ca0..bf709b59543 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -116,6 +116,9 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for the enabling of data checksums to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_TEMPTABLE_WAIT	"Waiting for temporary tables to be dropped so that data checksums can be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -352,6 +355,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index c756c2bebaa..f4e264ebf33 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 545d1e90fbd..34cce2ce0be 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,9 +293,18 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+
 		case B_IO_WORKER:
 			backendDesc = gettext_noop("io worker");
 			break;
+
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
+
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -895,7 +904,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 641e535a73c..589e7eab9e8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -750,6 +750,24 @@ InitPostgres(const char *in_dbname, Oid dboid,
 
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, and therefore may not have the current value
+	 * of LocalDataChecksumVersion (it will have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
@@ -878,7 +896,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..de4d21de504 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", PG_DATA_CHECKSUM_VERSION, true},
+	{"off", PG_DATA_CHECKSUM_OFF, true},
+	{"inprogress-on", PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, true},
+	{"inprogress-off", PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -616,7 +624,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -1968,17 +1975,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5417,6 +5413,16 @@ struct config_enum ConfigureNamesEnum[] =
 		DEFAULT_IO_METHOD, io_method_options,
 		NULL, assign_io_method, NULL
 	},
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		PG_DATA_CHECKSUM_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
 
 	/* End-of-list marker */
 	{
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f20be82862a..4e9e85fdc54 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 10de058ce91..acf5c7b026e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("checksums are being enabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d12798be3d8..8bcc5aa8a63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -229,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index cc06fc29ab2..cc78b00fe4c 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	uint32		new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..a8877fb87d1 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksums at the time of the checkpoint  */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;	/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 118d6da1ace..c6f4e31a12f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12356,6 +12356,25 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'bool', proallargtypes => '{bool}',
+  proargmodes => '{i}',
+  proargnames => '{fast}',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1cde4bd9bcf..cf6de4ef12d 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -162,4 +162,20 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..2a0d7b6de42 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -366,6 +366,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -391,6 +394,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..2cd066fd0fe
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for data checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Possible operations the Datachecksumsworker can perform */
+typedef enum DataChecksumsWorkerOperation
+{
+	ENABLE_DATACHECKSUMS,
+	DISABLE_DATACHECKSUMS,
+	/* TODO: VERIFY_DATACHECKSUMS, */
+}			DataChecksumsWorkerOperation;
+
+/*
+ * Possible states for a database entry which has been processed. Exported
+ * here since we want to be able to reference this from injection point tests.
+ */
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index aeb67c498c5..30fb0f62d4c 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -16,6 +16,7 @@
 
 #include "access/xlogdefs.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/item.h"
 #include "storage/off.h"
 
@@ -205,7 +206,6 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
-#define PG_DATA_CHECKSUM_VERSION	1
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..b3f368a15b5 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,20 @@
 
 #include "storage/block.h"
 
+/*
+ * Checksum version 0 (PG_DATA_CHECKSUM_OFF) is used when data checksums are
+ * disabled.  PG_DATA_CHECKSUM_VERSION means that data checksums are enabled
+ * in the cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 208d2e3a8ed..4a90b7fa372 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -85,6 +85,7 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
+PG_LWLOCK(54, DataChecksumsWorker)
 
 /*
  * There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * consume 2 additional slots while checksums are being enabled or disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 93be0f57289..6b4450eb473 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('ssl_passphrase_callback')
 subdir('test_aio')
 subdir('test_binaryheap')
 subdir('test_bloomfilter')
+subdir('test_checksums')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
 subdir('test_ddl_deparse')
diff --git a/src/test/modules/test_checksums/.gitignore b/src/test/modules/test_checksums/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/modules/test_checksums/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
new file mode 100644
index 00000000000..b9136bb513f
--- /dev/null
+++ b/src/test/modules/test_checksums/Makefile
@@ -0,0 +1,36 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/modules/test_checksums
+#
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/modules/test_checksums/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+MODULE_big = test_checksums
+OBJS = \
+	$(WIN32RES) \
+	test_checksums.o
+PGFILEDESC = "test_checksums - test code for data checksums"
+
+EXTENSION = test_checksums
+DATA = test_checksums--1.0.sql
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/modules/test_checksums/README b/src/test/modules/test_checksums/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/modules/test_checksums/README
@@ -0,0 +1,22 @@
+src/test/modules/test_checksums/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: "make check" creates a temporary installation, and the tests set up
+multiple nodes (a primary and, where needed, one or more standbys) for the
+purpose of the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
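+
+NOTE: Some of the timing-sensitive tests are skipped unless the
+"checksum_delays" group is listed in PG_TEST_EXTRA, for example:
+
+    make check PG_TEST_EXTRA="checksum_delays"
+
+The injection point tests additionally require a build with injection
+points enabled.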
diff --git a/src/test/modules/test_checksums/meson.build b/src/test/modules/test_checksums/meson.build
new file mode 100644
index 00000000000..4e57c5de09b
--- /dev/null
+++ b/src/test/modules/test_checksums/meson.build
@@ -0,0 +1,34 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+test_checksums_sources = files(
+	'test_checksums.c',
+)
+
+test_checksums = shared_module('test_checksums',
+	test_checksums_sources,
+	kwargs: pg_test_mod_args,
+)
+test_install_libs += test_checksums
+
+test_install_data += files(
+	'test_checksums.control',
+	'test_checksums--1.0.sql',
+)
+
+tests += {
+  'name': 'test_checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+      'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+    },
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+      't/005_injection.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_checksums/t/001_basic.pl b/src/test/modules/test_checksums/t/001_basic.pl
new file mode 100644
index 00000000000..728a5c4510c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/001_basic.pl
@@ -0,0 +1,63 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+test_checksum_state($node, 'off');
+
+# Enable data checksums and wait for the state transition to 'on'
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1 ");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op so we explicitly don't
+# wait for any state transition as none should happen here
+enable_data_checksums($node);
+test_checksum_state($node, 'on');
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again and wait for the state transition
+disable_data_checksums($node, wait => 'on');
+
+# Test reading data again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/002_restarts.pl b/src/test/modules/test_checksums/t/002_restarts.pl
new file mode 100644
index 00000000000..26e60ea3ee1
--- /dev/null
+++ b/src/test/modules/test_checksums/t/002_restarts.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with a
+# restart which breaks processing.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Initialize result storage for queries
+my $result;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 6
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_delays\b/);
+
+	# Create a barrier for checksumming to block on, in this case a pre-
+	# existing temporary table which is kept open while processing is started.
+	# We accomplish this by setting up an interactive psql process which
+	# keeps the temporary table alive while we enable checksums from another
+	# psql process.
+	#
+	# This is a similar test to the synthetic variant in 005_injection.pl
+	# which fakes this scenario.
+	my $bsession = $node->background_psql('postgres');
+	$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+	# In another session, make sure we can see the blocking temp table but
+	# start processing anyways and check that we are blocked with a proper
+	# wait event.
+	$result = $node->safe_psql('postgres',
+		"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';"
+	);
+	is($result, 't', 'ensure we can see the temporary table');
+
+	# Enabling data checksums shouldn't complete, as processing is blocked on
+	# the temporary table held open by $bsession.  Ensure that we reach
+	# inprogress-on before running more tests.
+	enable_data_checksums($node, wait => 'inprogress-on');
+
+	# Wait for processing to finish and the worker waiting for leftover temp
+	# relations to be able to actually finish
+	$result = $node->poll_query_until(
+		'postgres',
+		"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';",
+		'ChecksumEnableTemptableWait');
+
+	# The datachecksumsworker waits for temporary tables to disappear for 3
+	# seconds before retrying, so sleep for 4 seconds to be guaranteed to see
+	# a retry cycle
+	sleep(4);
+
+	# Re-check the wait event to ensure we are blocked on the right thing.
+	$result = $node->safe_psql('postgres',
+			"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';");
+	is($result, 'ChecksumEnableTemptableWait',
+		'ensure the correct wait condition is set');
+	test_checksum_state($node, 'inprogress-on');
+
+	# Stop the cluster while bsession is still attached.  We can't close the
+	# session first since the brief period between closing and stopping might
+	# be enough for checksums to get enabled.
+	$node->stop;
+	$bsession->quit;
+	$node->start;
+
+	# Ensure the checksums aren't enabled across the restart.  This leaves the
+	# cluster in the same state as before we entered the SKIP block.
+	test_checksum_state($node, 'off');
+}
+
+enable_data_checksums($node, wait => 'on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+disable_data_checksums($node, wait => 1);
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..fe34b4d7d05
--- /dev/null
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -0,0 +1,114 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+test_checksum_state($node_primary, 'off');
+test_checksum_state($node_standby_1, 'off');
+
+# ---------------------------------------------------------------------------
+# Enable checksums for the cluster, and make sure that both the primary and
+# standby change state.
+#
+
+# Ensure that the primary switches to "inprogress-on"
+enable_data_checksums($node_primary, wait => 'inprogress-on');
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+my $result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary and standby
+wait_for_checksum_state($node_primary, 'on');
+wait_for_checksum_state($node_standby_1, 'on');
+
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, '19998', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+#
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+#
+
+# Disable checksums and wait for the operation to be replayed
+disable_data_checksums($node_primary);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+# Ensure that both the primary and the standby have switched to off
+wait_for_checksum_state($node_primary, 'off');
+wait_for_checksum_state($node_standby_1, 'off');
+# Double-check that data can be read back without errors
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, "19998", 'ensure we can safely read all data without checksums');
+
+$node_standby_1->stop;
+$node_primary->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/004_offline.pl b/src/test/modules/test_checksums/t/004_offline.pl
new file mode 100644
index 00000000000..e9fbcf77eab
--- /dev/null
+++ b/src/test/modules/test_checksums/t/004_offline.pl
@@ -0,0 +1,82 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started.  We
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+enable_data_checksums($node, wait => 'inprogress-on');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/005_injection.pl b/src/test/modules/test_checksums/t/005_injection.pl
new file mode 100644
index 00000000000..e9f4bfdcd91
--- /dev/null
+++ b/src/test/modules/test_checksums/t/005_injection.pl
@@ -0,0 +1,76 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# injection point tests injecting failures into the processing
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# ---------------------------------------------------------------------------
+# Test cluster setup
+#
+
+# Initiate testcluster
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Set up test environment
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+
+# ---------------------------------------------------------------------------
+# Inducing failures in processing
+
+# Force enabling checksums to fail by marking one of the databases as having
+# failed in processing.
+disable_data_checksums($node, wait => 1);
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(true);');
+enable_data_checksums($node, wait => 'off');
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(false);');
+
+# Force the enable checksums processing to make multiple passes by removing
+# one database from the list in the first pass.  This will simulate a CREATE
+# DATABASE during processing.  Doing this via fault injection keeps the test
+# independent of exact timing.
+$node->safe_psql('postgres', 'SELECT dcw_prune_dblist(true);');
+enable_data_checksums($node, wait => 'on');
+
+# ---------------------------------------------------------------------------
+# Timing and retry related tests
+#
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 4
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_delays\b/);
+
+	# Inject a delay in the barrier for enabling checksums
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_inject_delay_barrier();');
+	enable_data_checksums($node, wait => 'on');
+
+	# Fake the existence of a temporary table at the start of processing, which
+	# will force the processing to wait and retry until the table has
+	# disappeared.
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);');
+	enable_data_checksums($node, wait => 'on');
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/DataChecksums/Utils.pm b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
new file mode 100644
index 00000000000..ee2f2a1428f
--- /dev/null
+++ b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+DataChecksums::Utils - Utility functions for testing data checksums in a running cluster
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::Cluster;
+  use DataChecksums::Utils qw( .. );
+
+  # Create, and start, a new cluster
+  my $node = PostgreSQL::Test::Cluster->new('primary');
+  $node->init;
+  $node->start;
+
+  test_checksum_state($node, 'off');
+
+  enable_data_checksums($node);
+
+  wait_for_checksum_state($node, 'on');
+
+
+=cut
+
+package DataChecksums::Utils;
+
+use strict;
+use warnings FATAL => 'all';
+use Exporter 'import';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+our @EXPORT = qw(
+  test_checksum_state
+  wait_for_checksum_state
+  enable_data_checksums
+  disable_data_checksums
+);
+
+=pod
+
+=head1 METHODS
+
+=over
+
+=item test_checksum_state(node, state)
+
+Test that the current value of the data checksum GUC in the server running
+at B<node> matches B<state>.  If the values differ, a test failure is logged.
+Returns True if the values match, otherwise False.
+
+=cut
+
+sub test_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $result = $postgresnode->safe_psql('postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+	);
+	is($result, $state, 'ensure checksums are set to ' . $state);
+	return $result eq $state;
+}
+
+=item wait_for_checksum_state(node, state)
+
+Test the value of the data checksum GUC in the server running at B<node>
+repeatedly until it matches B<state> or times out.  Processing will run for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.  If the
+values still differ when the wait times out, False is returned and a test
+failure is logged, otherwise True is returned.
+
+=cut
+
+sub wait_for_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $res = $postgresnode->poll_query_until(
+		'postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+		$state);
+	is($res, 1, 'ensure data checksums are transitioned to ' . $state);
+	return $res == 1;
+}
+
+=item enable_data_checksums($node, %params)
+
+Function for enabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item cost_delay
+
+The C<cost_delay> to use when enabling data checksums, default is 0.
+
+=item cost_limit
+
+The C<cost_limit> to use when enabling data checksums, default is 100.
+
+=item fast
+
+If set to C<true> an immediate checkpoint will be issued after data
+checksums are enabled.  Setting this to false will lead to slower tests.
+The default is true.
+
+=item wait
+
+If defined, the function will wait until the cluster reaches the state given
+in this parameter, or until the wait times out, before returning.  The wait
+times out after $PostgreSQL::Test::Utils::timeout_default seconds.
+
+=back
+
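+A typical call, as used by the tests in this suite, waits until the cluster
+reports data checksums as enabled:
+
+  enable_data_checksums($node, wait => 'on');
+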
+=cut
+
+sub enable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{cost_delay} = 0 unless (defined($params{cost_delay}));
+	$params{cost_limit} = 100 unless (defined($params{cost_limit}));
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_enable_data_checksums(%s, %s, %s);
+EOQ
+
+	$postgresnode->safe_psql(
+		'postgres',
+		sprintf($query,
+			$params{cost_delay}, $params{cost_limit}, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, $params{wait})
+	  if (defined($params{wait}));
+}
+
+=item disable_data_checksums($node, %params)
+
+Function for disabling data checksums in the cluster running at B<node>.
+
+=over
+
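+=item fast
+
+Passed through as the argument of C<pg_disable_data_checksums()>, mirroring
+the parameter of the same name in C<enable_data_checksums>.  The default is
+true.
+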
+=item wait
+
+If defined, the function will wait until the state turns to B<off>, or until
+the wait times out, before returning.  The wait times out after
+$PostgreSQL::Test::Utils::timeout_default seconds.  Unlike in
+C<enable_data_checksums>, the value of this parameter is discarded.
+
+=back
+
+=cut
+
+sub disable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_disable_data_checksums(%s);
+EOQ
+
+	$postgresnode->safe_psql('postgres', sprintf($query, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, 'off') if (defined($params{wait}));
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/modules/test_checksums/test_checksums--1.0.sql b/src/test/modules/test_checksums/test_checksums--1.0.sql
new file mode 100644
index 00000000000..704b45a3186
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums--1.0.sql
@@ -0,0 +1,20 @@
+/* src/test/modules/test_checksums/test_checksums--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_checksums" to load this file. \quit
+
+CREATE FUNCTION dcw_inject_delay_barrier(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_inject_fail_database(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_prune_dblist(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_fake_temptable(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_checksums/test_checksums.c b/src/test/modules/test_checksums/test_checksums.c
new file mode 100644
index 00000000000..26897bff960
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.c
@@ -0,0 +1,173 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_checksums.c
+ *		Test data checksums
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_checksums/test_checksums.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/latch.h"
+#include "utils/injection_point.h"
+#include "utils/wait_event.h"
+
+#define USEC_PER_SEC    1000000
+
+
+PG_MODULE_MAGIC;
+
+extern PGDLLEXPORT void dc_delay_barrier(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fail_database(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_dblist(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fake_temptable(const char *name, const void *private_data, void *arg);
+
+/*
+ * Test for delaying emission of procsignalbarriers.
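+ *
+ * The callback sleeps for up to three seconds (or until the latch is set)
+ * before letting processing continue.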
+ */
+void
+dc_delay_barrier(const char *name, const void *private_data, void *arg)
+{
+	(void) name;
+	(void) private_data;
+
+	(void) WaitLatch(MyLatch,
+					 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+					 (3 * 1000),
+					 WAIT_EVENT_PG_SLEEP);
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_delay_barrier);
+Datum
+dcw_inject_delay_barrier(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-enable-checksums-delay",
+							 "test_checksums",
+							 "dc_delay_barrier",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksums-enable-checksums-delay");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
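+/*
+ * Test for inducing a failure in database processing.  On the first pass the
+ * result of the database currently being processed is overwritten with
+ * DATACHECKSUMSWORKER_FAILED; subsequent passes are left untouched.
+ */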
+void
+dc_fail_database(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	DataChecksumsWorkerResult *res = (DataChecksumsWorkerResult *) arg;
+
+	if (first_pass)
+		*res = DATACHECKSUMSWORKER_FAILED;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_fail_database);
+Datum
+dcw_inject_fail_database(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fail-db",
+							 "test_checksums",
+							 "dc_fail_database",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fail-db");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to remove an entry from the Databaselist to force re-processing since
+ * not all databases could be processed in the first iteration of the loop.
+ */
+void
+dc_dblist(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	List	   *DatabaseList = (List *) arg;
+
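+	/*
+	 * NB: list_delete_last() truncates the List in place (as long as the
+	 * list does not become empty), so the caller's list is shortened even
+	 * though only the local pointer is reassigned here.
+	 */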
+	if (first_pass)
+		DatabaseList = list_delete_last(DatabaseList);
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_prune_dblist);
+Datum
+dcw_prune_dblist(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-initial-dblist",
+							 "test_checksums",
+							 "dc_dblist",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-initial-dblist");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to force waiting for existing temptables.
+ */
+void
+dc_fake_temptable(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	int		   *numleft = (int *) arg;
+
+	if (first_pass)
+		*numleft = 1;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_fake_temptable);
+Datum
+dcw_fake_temptable(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fake-temptable-wait",
+							 "test_checksums",
+							 "dc_fake_temptable",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fake-temptable-wait");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_checksums/test_checksums.control b/src/test/modules/test_checksums/test_checksums.control
new file mode 100644
index 00000000000..84b4cc035a7
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.control
@@ -0,0 +1,4 @@
+comment = 'Test code for data checksums'
+default_version = '1.0'
+module_pathname = '$libdir/test_checksums'
+relocatable = true
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 35413f14019..950433ea929 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3872,6 +3872,42 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 35e8aad7701..4b9c5526e50 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2071,6 +2071,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 605f5070376..9042e4d38e3 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -59,6 +59,22 @@ io worker|relation|vacuum
 io worker|temp relation|normal
 io worker|wal|init
 io worker|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e6f2e93b2d6..56c2bf09ae4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -416,6 +416,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -608,6 +609,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4244,6 +4249,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#48Bruce Momjian
bruce@momjian.us
In reply to: Daniel Gustafsson (#47)
Re: Changing the state of data checksums in a running cluster

On Sat, Aug 16, 2025 at 09:34:03PM +0200, Daniel Gustafsson wrote:

Attached is a rebase on top of the func.sgml changes which caused this to no
longer apply.

This version is also substantially updated with a new injection point based
test suite, fixed a few bugs (found by said test suite), added checkpoint to
disabling checksums, code cleanup, more granular wait events, comment rewrites
and additions and more smaller cleanups.

I am very glad you went simple and didn't attempt restarting this
process from the place it stopped:

If the cluster is stopped while in <literal>inprogress-on</literal>
mode, for any reason, then this process must be
restarted manually. To do this, re-execute the function
<function>pg_enable_data_checksums()</function> once the cluster has
been restarted. The process will start over, there is no support for
resuming work from where it was interrupted.
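
In practice the manual restart amounts to little more than re-issuing the call
once the server is back up, as in this sketch (hypothetical TAP-style $node
handle; only the function and the GUC come from the patch):

# A shutdown while in inprogress-on mode reverts the state to "off"
# during the following startup, so processing is simply kicked off again.
$node->restart;
is($node->safe_psql('postgres', 'SHOW data_checksums'),
	'off', 'interrupted processing reverts to off');
$node->safe_psql('postgres', 'SELECT pg_enable_data_checksums();');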

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#49Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#47)
Re: Changing the state of data checksums in a running cluster

On 8/16/25 21:34, Daniel Gustafsson wrote:

Attached is a rebase on top of the func.sgml changes which caused this to no
longer apply.

This version is also substantially updated with a new injection point based
test suite, fixed a few bugs (found by said test suite), added checkpoint to
disabling checksums, code cleanup, more granular wait events, comment rewrites
and additions and more smaller cleanups.

Thanks for the updated patch.

The injection points seem like a huge improvement, allowing testing of
different code paths in a more deterministic way.

I started running the stress test, using pretty much exactly the version
posted in March [1]/messages/by-id/f528413c-477a-4ec3-a0df-e22a80ffbe41@vondra.me. So far I've noticed only one issue, where the
standby reports mismatched checksums on an FSM:

LOG: page verification failed, calculated checksum 24786 but expected 24760
CONTEXT: WAL redo at 0/0344A290 for Heap2/MULTI_INSERT+INIT: ntuples:
185, flags: 0x28; blkref #0: rel 1663/16384/16403, blk 0
LOG: invalid page in block 2 of relation base/16384/16403_fsm; zeroing
out page
CONTEXT: WAL redo at 0/0344A290 for Heap2/MULTI_INSERT+INIT: ntuples:
185, flags: 0x28; blkref #0: rel 1663/16384/16403, blk 0
WARNING: invalid page in block 2 of relation base/16384/16403_fsm;
zeroing out page
CONTEXT: WAL redo at 0/0344A290 for Heap2/MULTI_INSERT+INIT: ntuples:
185, flags: 0x28; blkref #0: rel 1663/16384/16403, blk 0
LOG: page verification failed, calculated checksum 37048 but expected 0
CONTEXT: WAL redo at 0/0344D7E0 for Heap2/MULTI_INSERT+INIT: ntuples:
61, flags: 0x28; blkref #0: rel 1663/16384/16400, blk 0
LOG: invalid page in block 2 of relation base/16384/16400_fsm; zeroing
out page

This happens quite regularly; it's not hard to hit. But I've only seen
it happen on an FSM, and only right after immediate shutdown. I don't
think that's quite expected.

I believe the built-in TAP tests (with injection points) can't catch
this, because there's no concurrent activity while flipping checksums
on/off. It'd be good to add something like that, e.g. by running pgbench
in the background.

I also don't see any restarts of the primary/standby. That might be good
to do too.
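
Roughly, such a test could be sketched like the following; this is not an
actual test from the patch, and the node setup, pgbench scale, timings and
final assertion are only illustrative (a standby could be layered on top in
the usual way):

use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use IPC::Run qw(start);
use Test::More;

my $primary = PostgreSQL::Test::Cluster->new('primary');
$primary->init;
$primary->start;

# Seed some data, then keep a write workload running in the background
# while checksums are flipped on.
$primary->command_ok([ 'pgbench', '-i', '-s', '10', 'postgres' ],
	'initialize pgbench');
my $pgbench = start([
		'pgbench', '-n', '-c', '4', '-T', '60',
		'-h', $primary->host, '-p', $primary->port, 'postgres'
	]);

$primary->safe_psql('postgres', 'SELECT pg_enable_data_checksums();');
$primary->poll_query_until('postgres',
	"SELECT current_setting('data_checksums') = 'on'");

# Crash-restart with the workload still running and check that the state
# sticks and the cluster comes back up cleanly.
$primary->stop('immediate');
$primary->start;
is($primary->safe_psql('postgres', 'SHOW data_checksums'),
	'on', 'checksums stay enabled across crash recovery');

$pgbench->kill_kill;
done_testing();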

I plan to randomize the stress test a bit more, once this FSM issue gets
fixed. Maybe that'll find some additional issues.

[1]: /messages/by-id/f528413c-477a-4ec3-a0df-e22a80ffbe41@vondra.me

--
Tomas Vondra

#50Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#49)
Re: Changing the state of data checksums in a running cluster

Hi,

I think there's a minor issue in how pg_checksums validates state before
checking the data.

The current patch simply does:

if (ControlFile->data_checksum_version == 0 &&
mode == PG_MODE_CHECK)
pg_fatal("data checksums are not enabled in cluster");

and that worked when the version was either 0 or 1. But now it can also
be 2 or 3, for inprogress-on / inprogress-off, and if the cluster gets
shut down at the right moment, that value can end up in the control file.

It doesn't make sense to verify checksums in such a cluster; pg_checksums
should handle that as "off", i.e. error out.

regards

--
Tomas Vondra

#51Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#49)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 20 Aug 2025, at 16:37, Tomas Vondra <tomas@vondra.me> wrote:

This happens quite regularly; it's not hard to hit. But I've only seen
it happen on an FSM, and only right after immediate shutdown. I don't
think that's quite expected.

I believe the built-in TAP tests (with injection points) can't catch
this, because there's no concurrent activity while flipping checksums
on/off. It'd be good to add something like that, e.g. by running pgbench
in the background.

In searching for this bug I opted for implementing a version of the stress
tests as a TAP test, see 006_concurrent_pgbench.pl in the attached patch
version. It's gated behind PG_TEST_EXTRA since it's clearly not something
which can be enabled by default (if this goes in this needs to be re-done to
provide two levels IMO, but during testing this is more convenient). I'm
curious to see which improvements you can think of to make it stress the code
to the breaking point.
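
For reference, the PG_TEST_EXTRA gating presumably amounts to an early skip
along these lines; a sketch only, the actual guard in 006_concurrent_pgbench.pl
may look different (checksum_extended is the value documented in regress.sgml
in this patch):

use strict;
use warnings;
use PostgreSQL::Test::Utils;
use Test::More;

if (!$ENV{PG_TEST_EXTRA}
	|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
{
	plan skip_all =>
	  'extended data checksums tests not enabled in PG_TEST_EXTRA';
}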

I think there's a minor issue in how pg_checksums validates state before
checking the data.

The current patch simply does:

if (ControlFile->data_checksum_version == 0 &&
mode == PG_MODE_CHECK)
pg_fatal("data checksums are not enabled in cluster");

and that worked when the version was either 0 or 1. But now it can also
be 2 or 3, for inprogress-on / inprogress-off, and if the cluster gets
shut down at the right moment, that value can end up in the control file.

Good point, I've changed the test to check for checksums being enabled rather
than checking if they are disabled.

--
Daniel Gustafsson

Attachments:

v20250825-0001-Online-enabling-and-disabling-of-data-chec.patch (application/octet-stream)
From 208eee4802ab9bb4ffc6aca6fc54b5c3e006b5e3 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 15 Aug 2025 14:48:02 +0200
Subject: [PATCH v20250825] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  Tomas
Vondra has given invaluable assistance with not only review but
very in-depth testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func/func-admin.sgml             |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/regress.sgml                     |   12 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  547 +++++-
 src/backend/access/transam/xlogfuncs.c        |   43 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   16 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/auxprocess.c           |   19 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1463 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |    3 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    4 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |   12 +-
 src/backend/utils/init/postinit.c             |   20 +-
 src/backend/utils/misc/guc_tables.c           |   30 +-
 src/bin/pg_checksums/pg_checksums.c           |    4 +-
 src/bin/pg_controldata/pg_controldata.c       |    2 +
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   19 +
 src/include/commands/progress.h               |   16 +
 src/include/miscadmin.h                       |    6 +
 src/include/postmaster/datachecksumsworker.h  |   51 +
 src/include/storage/bufpage.h                 |    2 +-
 src/include/storage/checksum.h                |   14 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/modules/meson.build                  |    1 +
 src/test/modules/test_checksums/.gitignore    |    2 +
 src/test/modules/test_checksums/Makefile      |   36 +
 src/test/modules/test_checksums/README        |   22 +
 src/test/modules/test_checksums/meson.build   |   35 +
 .../modules/test_checksums/t/001_basic.pl     |   63 +
 .../modules/test_checksums/t/002_restarts.pl  |  110 ++
 .../test_checksums/t/003_standby_restarts.pl  |  114 ++
 .../modules/test_checksums/t/004_offline.pl   |   82 +
 .../modules/test_checksums/t/005_injection.pl |   76 +
 .../t/006_concurrent_pgbench.pl               |  224 +++
 .../test_checksums/t/DataChecksums/Utils.pm   |  185 +++
 .../test_checksums/test_checksums--1.0.sql    |   20 +
 .../modules/test_checksums/test_checksums.c   |  173 ++
 .../test_checksums/test_checksums.control     |    4 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   36 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 67 files changed, 4004 insertions(+), 56 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/modules/test_checksums/.gitignore
 create mode 100644 src/test/modules/test_checksums/Makefile
 create mode 100644 src/test/modules/test_checksums/README
 create mode 100644 src/test/modules/test_checksums/meson.build
 create mode 100644 src/test/modules/test_checksums/t/001_basic.pl
 create mode 100644 src/test/modules/test_checksums/t/002_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/003_standby_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/004_offline.pl
 create mode 100644 src/test/modules/test_checksums/t/005_injection.pl
 create mode 100644 src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
 create mode 100644 src/test/modules/test_checksums/t/DataChecksums/Utils.pm
 create mode 100644 src/test/modules/test_checksums/test_checksums--1.0.sql
 create mode 100644 src/test/modules/test_checksums/test_checksums.c
 create mode 100644 src/test/modules/test_checksums/test_checksums.control

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 57ff333159f..88d260795b8 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -2960,4 +2960,75 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in the cluster
+        and enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch the data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index b88cac598e9..a4e16d03aae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -184,6 +184,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm> processes
+     for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3f4a27a736e..6082d991497 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3527,8 +3527,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3538,8 +3539,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6877,6 +6878,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description>
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker or launcher process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher
+       process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any progress made by the online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8838fe7f022..7074751834e 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -263,6 +263,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
    The following values are currently supported:
    <variablelist>
+    <varlistentry>
+     <term><literal>checksum_extended</literal></term>
+     <listitem>
+      <para>
+       Runs additional tests for enabling data checksums which inject delays
+       and retries in the processing, as well as tests that run pgbench
+       concurrently and randomly restart the cluster.  Some of these test
+       suites require injection points to be enabled in the installation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><literal>kerberos</literal></term>
      <listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster into
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will use two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7ffb2179151..46edf531359 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -550,6 +550,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -647,6 +650,36 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version.  After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing.  The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state.  Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_ON_VERSION this would
+ * trigger an assert failure (this is the only transition with an assert) when
+ * processing the barrier.  This may happen if the process is spawned between
+ * the update of XLogCtl->data_checksum_version and the barrier being emitted.
+ * This can only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +748,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(uint32 new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +863,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +879,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4229,6 +4267,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4552,10 +4596,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
 }
 
 /*
@@ -4589,13 +4629,374 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled. During "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages, it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	/*
+	 * If the process was spawned between updating XLogCtl and emitting the
+	 * barrier it will have seen the updated value, so for the first barrier
+	 * we accept both "on" and "inprogress-on".
+	 */
+	Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+		   (InitialDataChecksumTransition &&
+			(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)));
+
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	InitialDataChecksumTransition = false;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	/*
+	 * We should never get here directly from a cluster with data checksums
+	 * enabled; an inprogress state should be in between.  When there are no
+	 * failures the inprogress-off state should precede, but in case of error
+	 * in processing we can also reach here from the inprogress-on state.
+	 */
+	Assert((LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION) &&
+		   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION));
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_OFF);
+	return true;
+}
+
+/*
+ * InitLocalControlData
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	data_checksums = data_checksum_version;
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -4870,6 +5271,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -5039,6 +5441,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* Use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6180,6 +6587,47 @@ StartupXLOG(void)
 	pfree(endOfRecoveryInfo->recoveryStopReason);
 	pfree(endOfRecoveryInfo);
 
+	/*
+	 * If we reach this point with checksums in the inprogress-on state, it
+	 * means that data checksums were in the process of being enabled when
+	 * the cluster shut down. Processing didn't finish, and there is no
+	 * capability to resume from where it left off, so the operation has to
+	 * be restarted from scratch. Thus, revert the state back to off and
+	 * inform the user with a warning message. Being able to resume
+	 * processing is a TODO, but it cannot be initiated from here since a
+	 * dynamic background worker cannot be launched from this point (it has
+	 * to be done from a regular backend).
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		ereport(WARNING,
+				(errmsg("data checksums state has been set to off"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+	}
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that all backends had already stopped validating checksums,
+	 * so we can complete the transition to off without requiring any action
+	 * from the user.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -6471,7 +6919,7 @@ GetRedoRecPtr(void)
 	XLogRecPtr	ptr;
 
 	/*
-	 * The possibly not up-to-date copy in XlogCtl is enough. Even if we
+	 * The possibly not up-to-date copy in XLogCtl is enough. Even if we
 	 * grabbed a WAL insertion lock to read the authoritative value in
 	 * Insert->RedoRecPtr, someone might update it just after we've released
 	 * the lock.
@@ -7035,6 +7483,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from XLogCtl, valid at
+	 * the time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7290,6 +7744,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7435,6 +7892,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7776,6 +8237,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* start with the data checksum version from the latest checkpoint */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8187,6 +8652,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
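+ *
+ * The record is flushed before returning so that the state change is
+ * durable before the callers emit the corresponding procsignal barrier
+ * (see the SetDataChecksums* functions).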
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8605,6 +9088,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..337932a89e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,45 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	bool		fast = PG_GETARG_BOOL(0);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports
+ * vacuum-like cost-based throttling to limit system load. Starts a
+ * background worker which sets data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index bb7d90aa5d9..54dcfbcb333 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..dea7ad3cf30 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,18 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
+CREATE OR REPLACE FUNCTION
+  pg_disable_data_checksums(fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'disable_data_checksums'
+  PARALLEL RESTRICTED;
+
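+-- For example (illustrative usage only), data checksum processing is
+-- started in the background, and can be stopped again, with:
+--   SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
+--   SELECT pg_disable_data_checksums();
+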
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +787,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums(boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1b3c5a55882..22f67c7ee4a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1354,6 +1354,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
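+-- For example, the progress of an ongoing operation can be followed with a
+-- query such as (illustrative only):
+--   SELECT datname, phase, relations_done, blocks_done
+--     FROM pg_stat_progress_data_checksums;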
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..5742a1dd724 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, and therefore may not have the current
+	 * value of LocalDataChecksumVersion (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 1ad65c237c3..0d2ade1f905 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..ff451d502ba
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1463 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or with pg_checksums on a
+ * shut-down cluster, no extra process is required since each page is
+ * checksummed, and verified, when accessed.  When enabling checksums on an
+ * already running cluster, this worker will ensure that all pages are
+ * checksummed before verification of the checksums is turned on. In the
+ * case of disabling checksums, the state transition is performed only in
+ * the control file; no changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off",
+ * which signals that checksums are still written but no longer verified.
+ * This ensures that backends which have yet to move from the "on" state can
+ * still validate data checksums without error.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
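+ *   For example, a call site writing a data checksum is expected to follow
+ *   this pattern (an illustrative sketch, not a verbatim excerpt of the
+ *   code):
+ *
+ *       HOLD_INTERRUPTS();
+ *       if (DataChecksumsNeedWrite())
+ *           PageSetChecksumInplace(page, blkno);
+ *       RESUME_INTERRUPTS();
+ *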
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when the checksum already matches: currently,
+ *     even if the checksum on the page happens to already be correct we
+ *     still dirty the page. It should be enough to only do the
+ *     log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
+	 * launcher what the target state is.
+	 */
+	DataChecksumsWorkerOperation launch_operation;
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	DataChecksumsWorkerOperation operation;
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static DataChecksumsWorkerOperation operation;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Request that the datachecksumsworker launcher process be started
+ *
+ * This is the entry point for initiating data checksums processing, for
+ * enabling as well as disabling. It records the desired target state in
+ * shared memory and registers the launcher as a dynamic background worker,
+ * unless a launcher is already running.
+ */
+void
+StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+#ifdef USE_ASSERT_CHECKING
+	/* The cost delay settings have no effect when disabling */
+	if (op == DISABLE_DATACHECKSUMS)
+		Assert(cost_delay == 0 && cost_limit == 0);
+#endif
+
+	/* Store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_operation = op;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					errmsg("failed to start background worker to process data checksums"));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * As of now we only update the block counter for main forks, in order
+	 * to avoid overly frequent calls. TODO: investigate whether we should
+	 * do it more frequently.
+	 */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the
+		 * page at one point in the past, i.e. only when checksums have first
+		 * been on, then off, and then turned on again.  TODO: investigate if
+		 * this could be avoided when the checksum is calculated to be
+		 * correct and wal_level is set to "minimal".
+		 */
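+		/*
+		 * Per the usual WAL protocol, the buffer modification and the WAL
+		 * insertion are wrapped in a critical section so that a failure in
+		 * between cannot leave a dirtied buffer without a matching WAL
+		 * record.
+		 */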
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we have been asked to
+		 * abort; the abort request will bubble up from here. It's safe to
+		 * check this without a lock, because if we miss it being set, we
+		 * will try again soon.
+		 */
+		Assert(operation == ENABLE_DATACHECKSUMS);
+		if (DataChecksumsWorkerShmem->launch_operation == DISABLE_DATACHECKSUMS)
+			abort_requested = true;
+
+		if (abort_requested)
+			return false;
+
+		/*
+		 * As of now we only update the block counter for main forks, in
+		 * order to avoid overly frequent calls. TODO: investigate whether
+		 * we should do it more frequently.
+		 */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
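+
+	/* Make sure rel->rd_smgr is set up before the smgrexists() probes below */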
+	RelationGetSmgr(rel);
+
+	for (ForkNumber fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+					   db->dbname),
+				errhint("The max_worker_processes setting might be too low."));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+					   db->dbname),
+				errhint("More details on the error might be found in the server log."));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed, the database cannot be fully processed, so
+	 * we have no alternative other than exiting. When enabling checksums we
+	 * won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be
+	 * restarted. When disabling, the pg_control version will have been set
+	 * to off before this, so when the cluster comes up checksums will be
+	 * off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("cannot enable data checksums without the postmaster process"),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			errmsg("initiating data checksum processing in database \"%s\"",
+				   db->dbname));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("postmaster exited during data checksum processing in \"%s\"",
+					   db->dbname),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				errmsg("data checksums processing was aborted in database \"%s\"",
+					   db->dbname));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits.
+ * We need to clear the launcher_running flag in shared memory, to ensure
+ * that processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, the process
+ * exits with a FATAL error since processing cannot be completed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
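+	/*
+	 * Wait until the oldest still-active xid is no older than the nextXid
+	 * value sampled above, i.e. until every transaction which was already
+	 * running when this function was called has finished.
+	 */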
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
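+	/*
+	 * Copy the requested parameters into the fields which the worker
+	 * process will read without holding the lock (see the comments in
+	 * DataChecksumsWorkerShmemStruct).
+	 */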
+	operation = DataChecksumsWorkerShmem->launch_operation;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->operation = operation;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		/*
+		 * All backends are now in inprogress-on state and are writing data
+		 * checksums.  Start processing all data at rest.
+		 */
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_operation != operation)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					errmsg("unable to enable data checksums in cluster"));
+		}
+
+		/*
+		 * Data checksums have been set on all pages, set the state to on in
+		 * order to instruct backends to validate checksums on reading.
+		 */
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		int			flags;
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (DataChecksumsWorkerShmem->immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_operation != operation)
+	{
+		DataChecksumsWorkerShmem->operation = DataChecksumsWorkerShmem->launch_operation;
+		operation = DataChecksumsWorkerShmem->launch_operation;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_FAST will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also covers the shared
+	 * catalogs; they don't need to be processed once per database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/* Allow a test case to modify the initial list of databases */
+	INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number should not change during processing; the
+	 * counter for processed databases is instead increased so that it can
+	 * be compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			/* Allow a test process to alter the result of the operation */
+			INJECTION_POINT("datachecksumsworker-fail-db", &result);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting with a new list" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database
+	 * was dropped between us getting the database list and trying to
+	 * process it. Get a fresh list of databases to detect the second case,
+	 * where the database was dropped before we had started processing it.
+	 * If a database still exists but enabling checksums failed, then we
+	 * fail the entire checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in the processed databases which failed, and
+		 * where the failed database still exists.  This indicates that
+		 * enabling checksums actually failed, and not that the failure was
+		 * due to the db being concurrently dropped.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					errmsg("failed to enable data checksums in \"%s\"", db->dbname));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		/* Force a checkpoint to make everything consistent */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+		ereport(ERROR,
+				errmsg("data checksums failed to get enabled in all databases, aborting"),
+				errhint("The server log might have more information on the cause of the error."));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if these are redundant assignments (the struct was just
+		 * zeroed), we want to be explicit about the initial state for
+		 * readability, since this state may need to be queried if
+		 * restartability is added later.
+		 */
+		DataChecksumsWorkerShmem->launch_operation = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If
+ * temp_relations is true then only temporary relations are returned. If
+ * temp_relations is false then non-temporary relations which have storage
+ * are returned. If include_shared is true then shared relations are
+ * included as well in a non-temporary list; include_shared has no relevance
+ * when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the launcher.  After enabling data
+ * checksums in each applicable relation in the database, it will wait for all
+ * temporary relations that were present when the function started to
+ * disappear before returning.  This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
+	operation = ENABLE_DATACHECKSUMS;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start.  We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->operation == ENABLE_DATACHECKSUMS);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* Process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				errmsg("data checksum processing aborted in database OID %u",
+					   dboid));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
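+		/*
+		 * Allow tests to fake the number of remaining temporary tables via
+		 * this injection point, in order to exercise the wait-and-retry path
+		 * without relying on timing.
+		 */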
+		INJECTION_POINT("datachecksumsworker-fake-temptable-wait", &numleft);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 3000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_TEMPTABLE_WAIT);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_operation != operation;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					errmsg("data checksum processing aborted in database OID %u",
+						   dboid));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
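+	/*
+	 * Report the final status back to the launcher via shared memory, so
+	 * that it knows processing of this database succeeded.
+	 */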
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index bf6b55ee830..955df32be5d 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -204,6 +204,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e1d643b013d..3d15a894c3a 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2983,6 +2983,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2fa045e6b0f..44213d140ae 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..6881c6f4069 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -576,6 +577,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index dbb49ed9197..19cf6512e52 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 8714a85e2d9..edc2512d79f 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -378,6 +378,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 13ae57ed649..a290d56f409 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -362,6 +362,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 5427da5bc1b..7f26d78cb77 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -116,6 +116,9 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for the enabling of data checksums to start."
+CHECKSUM_ENABLE_TEMPTABLE_WAIT	"Waiting for temporary tables to be dropped so that data checksums can be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -352,6 +355,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
+DataChecksumsWorker	"Waiting to read or update the state of the datachecksumsworker."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index c756c2bebaa..f4e264ebf33 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 545d1e90fbd..34cce2ce0be 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,9 +293,18 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+
 		case B_IO_WORKER:
 			backendDesc = gettext_noop("io worker");
 			break;
+
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
+
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -895,7 +904,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 641e535a73c..589e7eab9e8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -750,6 +750,24 @@ InitPostgres(const char *in_dbname, Oid dboid,
 
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current
+	 * LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
@@ -878,7 +896,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f137129209f..36fba8496df 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", PG_DATA_CHECKSUM_VERSION, true},
+	{"off", PG_DATA_CHECKSUM_OFF, true},
+	{"inprogress-on", PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, true},
+	{"inprogress-off", PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -616,7 +624,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -2043,17 +2050,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5489,6 +5485,16 @@ struct config_enum ConfigureNamesEnum[] =
 		DEFAULT_IO_METHOD, io_method_options,
 		NULL, assign_io_method, NULL
 	},
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		PG_DATA_CHECKSUM_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
 
 	/* End-of-list marker */
 	{
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f20be82862a..8411cecf3ff 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -568,7 +568,7 @@ main(int argc, char *argv[])
 		ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
 		pg_fatal("cluster must be shut down");
 
-	if (ControlFile->data_checksum_version == 0 &&
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_CHECK)
 		pg_fatal("data checksums are not enabled in cluster");
 
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 10de058ce91..acf5c7b026e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d12798be3d8..8bcc5aa8a63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -229,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index cc06fc29ab2..cc78b00fe4c 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	uint32		new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..a8877fb87d1 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;	/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 118d6da1ace..c6f4e31a12f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12356,6 +12356,25 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'bool', proallargtypes => '{bool}',
+  proargmodes => '{i}',
+  proargnames => '{fast}',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1cde4bd9bcf..cf6de4ef12d 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -162,4 +162,20 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..2a0d7b6de42 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -366,6 +366,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -391,6 +394,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..2cd066fd0fe
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for data checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Possible operations the Datachecksumsworker can perform */
+typedef enum DataChecksumsWorkerOperation
+{
+	ENABLE_DATACHECKSUMS,
+	DISABLE_DATACHECKSUMS,
+	/* TODO: VERIFY_DATACHECKSUMS, */
+}			DataChecksumsWorkerOperation;
+
+/*
+ * Possible states for a database entry which has been processed. Exported
+ * here since we want to be able to reference this from injection point tests.
+ */
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index aeb67c498c5..30fb0f62d4c 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -16,6 +16,7 @@
 
 #include "access/xlogdefs.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/item.h"
 #include "storage/off.h"
 
@@ -205,7 +206,6 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
-#define PG_DATA_CHECKSUM_VERSION	1
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..b3f368a15b5 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,20 @@
 
 #include "storage/block.h"
 
+/*
+ * Checksum version 0 is used when data checksums are disabled (OFF).
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 06a1ffd4b08..b8f7ba0be51 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -85,6 +85,7 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
+PG_LWLOCK(54, DataChecksumsWorker)
 
 /*
  * There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * can consume 2 additional slots while data checksums are being enabled or
+ * disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 93be0f57289..6b4450eb473 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('ssl_passphrase_callback')
 subdir('test_aio')
 subdir('test_binaryheap')
 subdir('test_bloomfilter')
+subdir('test_checksums')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
 subdir('test_ddl_deparse')
diff --git a/src/test/modules/test_checksums/.gitignore b/src/test/modules/test_checksums/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/modules/test_checksums/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
new file mode 100644
index 00000000000..b9136bb513f
--- /dev/null
+++ b/src/test/modules/test_checksums/Makefile
@@ -0,0 +1,36 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/modules/test_checksums
+#
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/modules/test_checksums/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+MODULE_big = test_checksums
+OBJS = \
+	$(WIN32RES) \
+	test_checksums.o
+PGFILEDESC = "test_checksums - test code for data checksums"
+
+EXTENSION = test_checksums
+DATA = test_checksums--1.0.sql
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/modules/test_checksums/README b/src/test/modules/test_checksums/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/modules/test_checksums/README
@@ -0,0 +1,22 @@
+src/test/modules/test_checksums/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: This creates a temporary installation (in the case of "check")
+with multiple nodes, primary and standby(s), for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
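+
+NOTE: Some of the longer-running tests are only run when checksum_extended
+is present in PG_TEST_EXTRA, and the injection point tests are skipped
+unless the build has injection points enabled.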
diff --git a/src/test/modules/test_checksums/meson.build b/src/test/modules/test_checksums/meson.build
new file mode 100644
index 00000000000..57156b63599
--- /dev/null
+++ b/src/test/modules/test_checksums/meson.build
@@ -0,0 +1,35 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+test_checksums_sources = files(
+	'test_checksums.c',
+)
+
+test_checksums = shared_module('test_checksums',
+	test_checksums_sources,
+	kwargs: pg_test_mod_args,
+)
+test_install_libs += test_checksums
+
+test_install_data += files(
+	'test_checksums.control',
+	'test_checksums--1.0.sql',
+)
+
+tests += {
+  'name': 'test_checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+       'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+	},
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+      't/005_injection.pl',
+      't/006_concurrent_pgbench.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_checksums/t/001_basic.pl b/src/test/modules/test_checksums/t/001_basic.pl
new file mode 100644
index 00000000000..728a5c4510c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/001_basic.pl
@@ -0,0 +1,63 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+test_checksum_state($node, 'off');
+
+# Enable data checksums and wait for the state transition to 'on'
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1 ");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again, which should be a no-op, so we explicitly don't
+# wait for any state transition as none should happen here.
+enable_data_checksums($node);
+test_checksum_state($node, 'on');
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again and wait for the state transition
+disable_data_checksums($node, wait => 'on');
+
+# Test reading data again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/002_restarts.pl b/src/test/modules/test_checksums/t/002_restarts.pl
new file mode 100644
index 00000000000..75599cf41f2
--- /dev/null
+++ b/src/test/modules/test_checksums/t/002_restarts.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with a
+# restart which breaks processing.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Initialize result storage for queries
+my $result;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 6
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Create a barrier for checksumming to block on, in this case a pre-
+	# existing temporary table which is kept open while processing is started.
+	# We can accomplish this by setting up an interactive psql process which
+	# keeps the temporary table alive while we enable checksums in another
+	# psql process.
+	#
+	# This is a similar test to the synthetic variant in 005_injection.pl
+	# which fakes this scenario.
+	my $bsession = $node->background_psql('postgres');
+	$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+	# In another session, make sure we can see the blocking temp table but
+	# start processing anyways and check that we are blocked with a proper
+	# wait event.
+	$result = $node->safe_psql('postgres',
+		"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';"
+	);
+	is($result, 't', 'ensure we can see the temporary table');
+
+	# Enabling data checksums shouldn't complete as processing is blocked on the
+	# temporary table held open by $bsession. Ensure that we reach inprogress-
+	# on before we do more tests.
+	enable_data_checksums($node, wait => 'inprogress-on');
+
+	# Wait for relation processing to finish and for the worker to start
+	# waiting on leftover temp relations before it can actually finish.
+	$result = $node->poll_query_until(
+		'postgres',
+		"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';",
+		'ChecksumEnableTemptableWait');
+
+	# The datachecksumsworker waits for temporary tables to disappear for 3
+	# seconds before retrying, so sleep for 4 seconds to be guaranteed to see
+	# a retry cycle
+	sleep(4);
+
+	# Re-check the wait event to ensure we are blocked on the right thing.
+	$result = $node->safe_psql('postgres',
+			"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';");
+	is($result, 'ChecksumEnableTemptableWait',
+		'ensure the correct wait condition is set');
+	test_checksum_state($node, 'inprogress-on');
+
+	# Stop the cluster while bsession is still attached.  We can't close the
+	# session first since the brief period between closing and stopping might
+	# be enough for checksums to get enabled.
+	$node->stop;
+	$bsession->quit;
+	$node->start;
+
+	# Ensure the checksums aren't enabled across the restart.  This leaves the
+	# cluster in the same state as before we entered the SKIP block.
+	test_checksum_state($node, 'off');
+}
+
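+# With no temporary table left to block processing, enabling data checksums
+# should now run to completion.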
+enable_data_checksums($node, wait => 'on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+disable_data_checksums($node, wait => 1);
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..fe34b4d7d05
--- /dev/null
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -0,0 +1,114 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+test_checksum_state($node_primary, 'off');
+test_checksum_state($node_standby_1, 'off');
+
+# ---------------------------------------------------------------------------
+# Enable checksums for the cluster, and make sure that both the primary and
+# standby change state.
+#
+
+# Ensure that the primary switches to "inprogress-on"
+enable_data_checksums($node_primary, wait => 'inprogress-on');
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+my $result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary and standby
+wait_for_checksum_state($node_primary, 'on');
+wait_for_checksum_state($node_standby_1, 'on');
+
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, '19998', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+#
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+#
+
+# Disable checksums and wait for the operation to be replayed
+disable_data_checksums($node_primary);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+# Ensure that the primary and standby have switched to off
+wait_for_checksum_state($node_primary, 'off');
+wait_for_checksum_state($node_standby_1, 'off');
+# Double-check reading data without errors
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, "19998", 'ensure we can safely read all data without checksums');
+
+$node_standby_1->stop;
+$node_primary->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/004_offline.pl b/src/test/modules/test_checksums/t/004_offline.pl
new file mode 100644
index 00000000000..e9fbcf77eab
--- /dev/null
+++ b/src/test/modules/test_checksums/t/004_offline.pl
@@ -0,0 +1,82 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
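+# Start enabling data checksums; processing will block on the temporary table,
+# leaving the cluster in the inprogress-on state.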
+enable_data_checksums($node, wait => 'inprogress-on');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/005_injection.pl b/src/test/modules/test_checksums/t/005_injection.pl
new file mode 100644
index 00000000000..f4459e0e636
--- /dev/null
+++ b/src/test/modules/test_checksums/t/005_injection.pl
@@ -0,0 +1,76 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# injection point tests injecting failures into the processing
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# ---------------------------------------------------------------------------
+# Test cluster setup
+#
+
+# Initiate testcluster
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Set up test environment
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+
+# ---------------------------------------------------------------------------
+# Inducing failures in processing
+
+# Force enabling checksums to fail by marking one of the databases as having
+# failed in processing.
+disable_data_checksums($node, wait => 1);
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(true);');
+enable_data_checksums($node, wait => 'off');
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(false);');
+
+# Force the enable checksums processing to make multiple passes by removing
+# one database from the list in the first pass.  This will simulate a CREATE
+# DATABASE during processing.  Doing this via fault injection makes the test
+# not be subject to exact timing.
+$node->safe_psql('postgres', 'SELECT dcw_prune_dblist(true);');
+enable_data_checksums($node, wait => 'on');
+
+# ---------------------------------------------------------------------------
+# Timing and retry related tests
+#
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 4
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Inject a delay in the barrier for enabling checksums
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_inject_delay_barrier();');
+	enable_data_checksums($node, wait => 'on');
+
+	# Fake the existence of a temporary table at the start of processing, which
+	# will force the processing to wait and retry in order to wait for it to
+	# disappear.
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);');
+	enable_data_checksums($node, wait => 'on');
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
new file mode 100644
index 00000000000..630abee9c63
--- /dev/null
+++ b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
@@ -0,0 +1,224 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# concurrent activity via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+my $node_primary_slot = 'physical_slot';
+my $node_primary_backup = 'primary_backup';
+my $node_primary;
+my $node_standby_1;
+
+# The number of full test iterations which will be performed
+my $TEST_ITERATIONS = 250;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+my $pgbench_running = 0;
+
+# Variables holding state for managing the cluster in various ways
+my @stop_modes = ();
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter
+sub background_pgbench
+{
+	my ($port, $readonly, $stdin, $stdout, $stderr) = @_;
+
+	my $pgbench_primary = IPC::Run::start(
+		[
+			'pgbench', '-p', $port, ($readonly == 1 ? '-S' : ''),
+			'-T', '600', '-c', '10', '-q', 'postgres'
+		],
+		'<' => \$stdin,
+		'>' => \$stdout,
+		'2>' => \$stderr,
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+sub flip_data_checksums
+{
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		if (int(rand(2)) == 1)
+		{
+			$node_primary->safe_psql('postgres',
+				'SELECT dcw_fake_temptable(true);');
+		}
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums($node_primary, wait => 'inprogress-on');
+		# Wait for checksum enable to be replayed
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Ensure that the standby has switched to "inprogress-on" or "on".
+		# Normally it would be "inprogress-on", but it is theoretically
+		# possible for the primary to complete the checksum enabling *and* have
+		# the standby replay that record before we reach the check below.
+		my $result = $node_standby_1->poll_query_until(
+			'postgres',
+			"SELECT setting = 'off' "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';",
+			'f');
+		is($result, 1,
+			'ensure standby has absorbed the inprogress-on barrier');
+		$result = $node_standby_1->safe_psql('postgres',
+			"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+		);
+
+		is(($result eq 'inprogress-on' || $result eq 'on'),
+			1, 'ensure checksums are on, or in progress, on standby_1');
+
+		# Wait for checksums enabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'on');
+		wait_for_checksum_state($node_standby_1, 'on');
+
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		disable_data_checksums($node_primary);
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Wait for checksums to be disabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'off');
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since that could easily slip by unnoticed, make sure
+		# it gets caught with a test failure.
+		is(1, 0, 'data_checksum_state variable has invalid state');
+	}
+}
+
+# Prepare an array with pg_ctl stop modes which we can later randomly select
+# from in order to stop the cluster in some way.
+for (my $i = 1; $i <= 100; $i++)
+{
+	if (int(rand($i * 2)) > $i)
+	{
+		push(@stop_modes, "immediate");
+	}
+	else
+	{
+		push(@stop_modes, "fast");
+	}
+}
+
+# Create and start a cluster with one primary and one standby node, and ensure
+# they are caught up and in sync.
+$node_primary = PostgreSQL::Test::Cluster->new('main');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+$node_primary->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,100000) AS a;");
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$node_primary_slot');");
+$node_primary->backup($node_primary_backup);
+
+$node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $node_primary_backup,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$node_primary_slot'
+]);
+$node_standby_1->start;
+
+$node_primary->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Main test suite. This loop will start a pgbench run on the cluster and while
+# that's running flip the state of data checksums concurrently. It will then
+# randomly restart the cluster (in fast or immediate mode) and then check for
+# the desired state.  The idea behind doing things randomly is to shake out
+# any timing-related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced to
+# its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	# If pgbench isn't running against the cluster then start it up
+	if (!$pgbench_running)
+	{
+		# Start a pgbench in the background against the primary
+		my ($pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr) =
+		  ('', '', '');
+		background_pgbench($node_primary->port, 0, $pgb_primary_stdin,
+			$pgb_primary_stdout, $pgb_primary_stderr);
+
+		# Start a select-only pgbench in the background on the standby
+		my ($pgb_standby_1_stdin, $pgb_standby_1_stdout,
+			$pgb_standby_1_stderr)
+		  = ('', '', '');
+		background_pgbench($node_standby_1->port, 1, $pgb_standby_1_stdin,
+			$pgb_standby_1_stdout, $pgb_standby_1_stderr);
+	}
+
+	$node_primary->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+	# Check that checksums are as expected on all nodes
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+
+	flip_data_checksums();
+	my $result = $node_primary->safe_psql('postgres',
+		"SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+	$result = $node_standby_1->safe_psql('postgres',
+		"SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on standby');
+
+	# Coin-toss to see if we are powercycling the cluster or not
+	if (int(rand(2)) == 1)
+	{
+		$node_primary->stop($stop_modes[ int(rand(100)) ]);
+		$node_standby_1->stop($stop_modes[ int(rand(100)) ]);
+
+		$node_primary->start;
+		$node_standby_1->start;
+
+		$pgbench_running = 0;
+	}
+
+	# Check that checksums are in the expected state on all nodes
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+}
+
+$node_standby_1->stop;
+$node_primary->stop;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/DataChecksums/Utils.pm b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
new file mode 100644
index 00000000000..ee2f2a1428f
--- /dev/null
+++ b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+DataChecksums::Utils - Utility functions for testing data checksums in a running cluster
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::Cluster;
+  use DataChecksums::Utils qw( .. );
+
+  # Create, and start, a new cluster
+  my $node = PostgreSQL::Test::Cluster->new('primary');
+  $node->init;
+  $node->start;
+
+  test_checksum_state($node, 'off');
+
+  enable_data_checksums($node);
+
+  wait_for_checksum_state($node, 'on');
+
+
+=cut
+
+package DataChecksums::Utils;
+
+use strict;
+use warnings FATAL => 'all';
+use Exporter 'import';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+our @EXPORT = qw(
+  test_checksum_state
+  wait_for_checksum_state
+  enable_data_checksums
+  disable_data_checksums
+);
+
+=pod
+
+=head1 METHODS
+
+=over
+
+=item test_checksum_state(node, state)
+
+Test that the current value of the data checksum GUC in the server running
+at B<node> matches B<state>.  If the values differ, a test failure is logged.
+Returns True if the values match, otherwise False.
+
+=cut
+
+sub test_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $result = $postgresnode->safe_psql('postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+	);
+	is($result, $state, 'ensure checksums are set to ' . $state);
+	return $result eq $state;
+}
+
+=item wait_for_checksum_state(node, state)
+
+Test the value of the data checksum GUC in the server running at B<node>
+repeatedly until it matches B<state> or times out.  Processing will run for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.  If the
+values differ when the process times out, False is returned and a test failure
+is logged, otherwise True.
+
+=cut
+
+sub wait_for_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $res = $postgresnode->poll_query_until(
+		'postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+		$state);
+	is($res, 1, 'ensure data checksums are transitioned to ' . $state);
+	return $res == 1;
+}
+
+=item enable_data_checksums($node, %params)
+
+Function for enabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item cost_delay
+
+The C<cost_delay> to use when enabling data checksums, default is 0.
+
+=item cost_limit
+
+The C<cost_limit> to use when enabling data checksums, default is 100.
+
+=item fast
+
+If set to C<true> an immediate checkpoint will be issued after data
+checksums are enabled.  Setting this to false will lead to slower tests.
+The default is true.
+
+=item wait
+
+If defined, the function will wait for the state defined in this parameter,
+or until the wait times out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+
+=back
+
+=cut
+
+sub enable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{cost_delay} = 0 unless (defined($params{cost_delay}));
+	$params{cost_limit} = 100 unless (defined($params{cost_limit}));
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_enable_data_checksums(%s, %s, %s);
+EOQ
+
+	$postgresnode->safe_psql(
+		'postgres',
+		sprintf($query,
+			$params{cost_delay}, $params{cost_limit}, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, $params{wait})
+	  if (defined($params{wait}));
+}
+
+=item disable_data_checksums($node, %params)
+
+Function for disabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item wait
+
+If defined, the function will wait for the state to turn to B<off>, or
+until the wait times out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+Unlike in C<enable_data_checksums>, the value of the parameter is ignored.
+
+=back
+
+=cut
+
+sub disable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_disable_data_checksums(%s);
+EOQ
+
+	$postgresnode->safe_psql('postgres', sprintf($query, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, 'off') if (defined($params{wait}));
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/modules/test_checksums/test_checksums--1.0.sql b/src/test/modules/test_checksums/test_checksums--1.0.sql
new file mode 100644
index 00000000000..704b45a3186
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums--1.0.sql
@@ -0,0 +1,20 @@
+/* src/test/modules/test_checksums/test_checksums--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_checksums" to load this file. \quit
+
+CREATE FUNCTION dcw_inject_delay_barrier(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_inject_fail_database(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_prune_dblist(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_fake_temptable(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_checksums/test_checksums.c b/src/test/modules/test_checksums/test_checksums.c
new file mode 100644
index 00000000000..26897bff960
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.c
@@ -0,0 +1,173 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_checksums.c
+ *		Test data checksums
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_checksums/test_checksums.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/latch.h"
+#include "utils/injection_point.h"
+#include "utils/wait_event.h"
+
+#define USEC_PER_SEC    1000000
+
+
+PG_MODULE_MAGIC;
+
+extern PGDLLEXPORT void dc_delay_barrier(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fail_database(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_dblist(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fake_temptable(const char *name, const void *private_data, void *arg);
+
+/*
+ * Test for delaying emission of procsignalbarriers.
+ */
+void
+dc_delay_barrier(const char *name, const void *private_data, void *arg)
+{
+	(void) name;
+	(void) private_data;
+
+	(void) WaitLatch(MyLatch,
+					 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+					 (3 * 1000),
+					 WAIT_EVENT_PG_SLEEP);
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_delay_barrier);
+Datum
+dcw_inject_delay_barrier(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-enable-checksums-delay",
+							 "test_checksums",
+							 "dc_delay_barrier",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksums-enable-checksums-delay");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+void
+dc_fail_database(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	DataChecksumsWorkerResult *res = (DataChecksumsWorkerResult *) arg;
+
+	if (first_pass)
+		*res = DATACHECKSUMSWORKER_FAILED;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_fail_database);
+Datum
+dcw_inject_fail_database(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fail-db",
+							 "test_checksums",
+							 "dc_fail_database",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fail-db");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to remove an entry from the DatabaseList to force re-processing since
+ * not all databases could be processed in the first iteration of the loop.
+ */
+void
+dc_dblist(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	List	   *DatabaseList = (List *) arg;
+
+	if (first_pass)
+		DatabaseList = list_delete_last(DatabaseList);
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_prune_dblist);
+Datum
+dcw_prune_dblist(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-initial-dblist",
+							 "test_checksums",
+							 "dc_dblist",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-initial-dblist");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to force waiting for existing temptables.
+ */
+void
+dc_fake_temptable(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	int		   *numleft = (int *) arg;
+
+	if (first_pass)
+		*numleft = 1;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_fake_temptable);
+Datum
+dcw_fake_temptable(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fake-temptable-wait",
+							 "test_checksums",
+							 "dc_fake_temptable",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fake-temptable-wait");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_checksums/test_checksums.control b/src/test/modules/test_checksums/test_checksums.control
new file mode 100644
index 00000000000..84b4cc035a7
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.control
@@ -0,0 +1,4 @@
+comment = 'Test code for data checksums'
+default_version = '1.0'
+module_pathname = '$libdir/test_checksums'
+relocatable = true
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 35413f14019..950433ea929 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3872,6 +3872,42 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "### Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "### Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 35e8aad7701..4b9c5526e50 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2071,6 +2071,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 605f5070376..9042e4d38e3 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -59,6 +59,22 @@ io worker|relation|vacuum
 io worker|temp relation|normal
 io worker|wal|init
 io worker|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..df0f49ea2aa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -416,6 +416,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -608,6 +609,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4243,6 +4248,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#52Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#51)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 8/25/25 20:32, Daniel Gustafsson wrote:

On 20 Aug 2025, at 16:37, Tomas Vondra <tomas@vondra.me> wrote:

This happens quite regularly, it's not hard to hit. But I've only seen
it to happen on a FSM, and only right after immediate shutdown. I don't
think that's quite expected.

I believe the built-in TAP tests (with injection points) can't catch
this, because there's no concurrent activity while flipping checksums
on/off. It'd be good to do something like that, by running pgbench in
the background, or something like that.

In searching for this bug I opted for implementing a version of the stress
tests as a TAP test, see 006_concurrent_pgbench.pl in the attached patch
version. It's gated behind PG_TEST_EXTRA since it's clearly not something
which can be enabled by default (if this goes in this need to be re-done to
provide two levels IMO, but during testing this is more convenient). I'm
curious to see which improvements you can think to make it stress the code to
the breaking point.

I think this TAP looks very nice, but there's a couple issues with it.
See the attached patch fixing those.

1) I think test_checksums should be in src/test/modules/Makefile?

2) The test_checksums/Makefile didn't seem to work for me, I was getting

Makefile:23: *** recipe commences before first target. Stop.

There was a missing "\", so I had to fix that. And then it was
complaining about Makefile.global or something, so I fixed that by
cargo-culting what other Makefiles in test modules do. Now it seems to
work for me. I guess you're on meson?

3) I'm no perl expert, but AFAICS the test wasn't really running the
pgbench, for a couple of reasons. It was passing "-q" to pgbench, but
that's only for initialization. The clusters had max_connections=10, but
the pgbench was using "-c 10", so I was getting "too many connections".
It was not setting "$pgbench_running = 1" so the other loops were
getting "too many connections" too. Another thing is I'm not sure it's
OK to pass '' to IPC::Run::start, I think it'll take it as an argument,
confusing pgbench.
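
FWIW, instead of duplicating the function, another option might be to build
the argument list conditionally, so no empty string ends up in the argv
passed to IPC::Run::start. Roughly (untested sketch, using the same names
as the test):

    sub background_pgbench
    {
        my ($port, $readonly, $stdin, $stdout, $stderr) = @_;

        # only pass -S for the read-only runs against the standby
        my @cmd = ('pgbench', '-p', $port);
        push(@cmd, '-S') if $readonly;
        push(@cmd, '-T', '600', '-c', '10', 'postgres');

        return IPC::Run::start(
            \@cmd,
            '<' => \$stdin,
            '>' => \$stdout,
            '2>' => \$stderr,
            IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
    }

But splitting it into two functions, as in the attached, works just as well.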

With these changes it runs for me, and I even saw some

LOG: page verification failed

in tmp_check/log/006_concurrent_pgbench_standby_1.log. But it takes a
while - a couple minutes, maybe? I think I saw it at

t/006_concurrent_pgbench.pl .. 427/?

or something like that. I think the bash version did a couple things
differently, which might make the failures more frequent (but it's just
a wild guess).

In particular, I think the script restarts the two nodes independently,
while the TAP always stops both primary and standby, in this order. I
think it'd be useful to restart one or both.
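
Something like this coin-toss would probably do (just a sketch, ignoring the
pgbench bookkeeping that has to happen around the restarts; $which is a new
variable):

    # 0 = neither, 1 = primary only, 2 = standby only, 3 = both
    my $which = int(rand(4));
    $node_primary->stop($stop_modes[ int(rand(100)) ]) if ($which & 1);
    $node_standby_1->stop($stop_modes[ int(rand(100)) ]) if ($which & 2);
    $node_primary->start if ($which & 1);
    $node_standby_1->start if ($which & 2);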

The other thing is the bash script added some random delays/sleep, which
increases the test duration, but it also means generating somewhat
random amounts of data, etc. It also randomized some other stuff (scale,
client count, ...). But that can wait.

regards

--
Tomas Vondra

Attachments:

checksums-fixes.patch (text/x-patch)
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 903a8ac151a..c8f2747b261 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -17,6 +17,7 @@ SUBDIRS = \
 		  test_aio \
 		  test_binaryheap \
 		  test_bloomfilter \
+		  test_checksums \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
 		  test_ddl_deparse \
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
index b9136bb513f..a5b6259a728 100644
--- a/src/test/modules/test_checksums/Makefile
+++ b/src/test/modules/test_checksums/Makefile
@@ -9,28 +9,32 @@
 #
 #-------------------------------------------------------------------------
 
-subdir = src/test/checksum
-top_builddir = ../../..
-include $(top_builddir)/src/Makefile.global
-
 EXTRA_INSTALL = src/test/modules/injection_points
 
 export enable_injection_points
 
 MODULE_big = test_checksums
 OBJS = \
-	$(WIN32RES)
+	$(WIN32RES) \
 	test_checksums.o
 PGFILEDESC = "test_checksums - test code for data checksums"
 
 EXTENSION = test_checksums
 DATA = test_checksums--1.0.sql
 
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
 check:
 	$(prove_check)
 
 installcheck:
 	$(prove_installcheck)
-
-clean distclean maintainer-clean:
-	rm -rf tmp_check
diff --git a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
index 630abee9c63..364225933ca 100644
--- a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
+++ b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
@@ -36,21 +36,31 @@ if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
 	plan skip_all => 'Extended tests not enabled';
 }
 
-if ($ENV{enable_injection_points} ne 'yes')
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter
+sub background_ro_pgbench
 {
-	plan skip_all => 'Injection points not supported by this build';
+	my ($port, $readonly, $stdin, $stdout, $stderr) = @_;
+
+	my $pgbench_primary = IPC::Run::start(
+		[
+			'pgbench', '-p', $port, '-S',
+			'-T', '600', '-c', '10', 'postgres'
+		],
+		'<' => \$stdin,
+		'>' => \$stdout,
+		'2>' => \$stderr,
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
 }
 
-# Start a pgbench run in the background against the server specified via the
-# port passed as parameter
-sub background_pgbench
+sub background_rw_pgbench
 {
 	my ($port, $readonly, $stdin, $stdout, $stderr) = @_;
 
 	my $pgbench_primary = IPC::Run::start(
 		[
-			'pgbench', '-p', $port, ($readonly == 1 ? '-S' : ''),
-			'-T', '600', '-c', '10', '-q', 'postgres'
+			'pgbench', '-p', $port,
+			'-T', '600', '-c', '10', 'postgres'
 		],
 		'<' => \$stdin,
 		'>' => \$stdout,
@@ -141,6 +151,7 @@ for (my $i = 1; $i <= 100; $i++)
 # they are caught up and in sync.
 $node_primary = PostgreSQL::Test::Cluster->new('main');
 $node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->append_conf('postgresql.conf', "max_connections = 30");
 $node_primary->start;
 $node_primary->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
 # Create some content to have un-checksummed data in the cluster
@@ -177,15 +188,17 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 		# Start a pgbench in the background against the primary
 		my ($pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr) =
 		  ('', '', '');
-		background_pgbench($node_primary->port, 0, $pgb_primary_stdin,
+		background_rw_pgbench($node_primary->port, 0, $pgb_primary_stdin,
 			$pgb_primary_stdout, $pgb_primary_stderr);
 
 		# Start a select-only pgbench in the background on the standby
 		my ($pgb_standby_1_stdin, $pgb_standby_1_stdout,
 			$pgb_standby_1_stderr)
 		  = ('', '', '');
-		background_pgbench($node_standby_1->port, 1, $pgb_standby_1_stdin,
+		background_ro_pgbench($node_standby_1->port, 1, $pgb_standby_1_stdin,
 			$pgb_standby_1_stdout, $pgb_standby_1_stderr);
+
+		$pgbench_running = 1;
 	}
 
 	$node_primary->safe_psql('postgres', "UPDATE t SET a = a + 1;");
#53Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#52)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 26 Aug 2025, at 01:06, Tomas Vondra <tomas@vondra.me> wrote:

I think this TAP looks very nice, but there's a couple issues with it.
See the attached patch fixing those.

Thanks, I have incorporated (most of) your patch in the attached. I did keep
the PG_TEST_EXTRA check for injection points though, which I assume was removed
by mistake.

With these changes it runs for me, and I even saw some

LOG: page verification failed

in tmp_check/log/006_concurrent_pgbench_standby_1.log. But it takes a
while - a couple minutes, maybe? I think I saw it at

t/006_concurrent_pgbench.pl .. 427/?

That's very interesting; I have been running it to the timeout several times in a
row without hitting any verification failures. Will keep running.

or something like that. I think the bash version did a couple things
differently, which might make the failures more frequent (but it's just
a wild guess).

In particular, I think the script restarts the two nodes independently,
while the TAP always stops both primary and standby, in this order. I
think it'd be useful to restart one or both.

Done in the attached; it will now randomly stop one node, both, or neither.  If a
node is stopped I've added an offline pg_checksums step to validate the
data files as a why-not test.
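
Roughly along these lines (a sketch, not necessarily the exact code in the
attached; pg_checksums requires a cleanly shut down cluster, so it is only
done after a fast stop and with checksums fully enabled, with $stop_mode
standing in for the mode chosen for the stop):

    if ($stop_mode eq 'fast' && $data_checksum_state eq 'on')
    {
        PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '--check',
            '-D', $node_primary->data_dir);
    }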

The other thing is the bash script added some random delays/sleep, which
increases the test duration, but it also means generating somewhat
random amounts of data, etc. It also randomized some other stuff (scale,
client count, ...). But that can wait.

Added as well in a few places, maybe more can be sprinkled in.
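
Sprinkling in more could be as simple as something like this (just an example,
with the ranges picked arbitrarily):

    use Time::HiRes qw(usleep);

    # pause for up to 0.25s, roughly half of the time
    usleep(int(rand(250_000))) if int(rand(2));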

--
Daniel Gustafsson

Attachments:

v20250827-0001-Online-enabling-and-disabling-of-data-chec.patch (application/octet-stream)
From 07957af635c9d2a22fc0c12abd9fb91087934b48 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 15 Aug 2025 14:48:02 +0200
Subject: [PATCH v20250827] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled during initdb or
when the cluster is offline using the pg_checksums application.  This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  Tomas
Vondra has given invaluable assistance with not only review but
very in-depth testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func/func-admin.sgml             |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/regress.sgml                     |   12 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  547 +++++-
 src/backend/access/transam/xlogfuncs.c        |   43 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   16 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/auxprocess.c           |   19 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1463 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |    3 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    4 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |   12 +-
 src/backend/utils/init/postinit.c             |   20 +-
 src/backend/utils/misc/guc_tables.c           |   30 +-
 src/bin/pg_checksums/pg_checksums.c           |    4 +-
 src/bin/pg_controldata/pg_controldata.c       |    2 +
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   19 +
 src/include/commands/progress.h               |   16 +
 src/include/miscadmin.h                       |    6 +
 src/include/postmaster/datachecksumsworker.h  |   51 +
 src/include/storage/bufpage.h                 |    2 +-
 src/include/storage/checksum.h                |   14 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/modules/Makefile                     |    1 +
 src/test/modules/meson.build                  |    1 +
 src/test/modules/test_checksums/.gitignore    |    2 +
 src/test/modules/test_checksums/Makefile      |   40 +
 src/test/modules/test_checksums/README        |   22 +
 src/test/modules/test_checksums/meson.build   |   35 +
 .../modules/test_checksums/t/001_basic.pl     |   63 +
 .../modules/test_checksums/t/002_restarts.pl  |  110 ++
 .../test_checksums/t/003_standby_restarts.pl  |  114 ++
 .../modules/test_checksums/t/004_offline.pl   |   82 +
 .../modules/test_checksums/t/005_injection.pl |   76 +
 .../t/006_concurrent_pgbench.pl               |  283 ++++
 .../test_checksums/t/DataChecksums/Utils.pm   |  185 +++
 .../test_checksums/test_checksums--1.0.sql    |   20 +
 .../modules/test_checksums/test_checksums.c   |  173 ++
 .../test_checksums/test_checksums.control     |    4 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   45 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 68 files changed, 4077 insertions(+), 56 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/modules/test_checksums/.gitignore
 create mode 100644 src/test/modules/test_checksums/Makefile
 create mode 100644 src/test/modules/test_checksums/README
 create mode 100644 src/test/modules/test_checksums/meson.build
 create mode 100644 src/test/modules/test_checksums/t/001_basic.pl
 create mode 100644 src/test/modules/test_checksums/t/002_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/003_standby_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/004_offline.pl
 create mode 100644 src/test/modules/test_checksums/t/005_injection.pl
 create mode 100644 src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
 create mode 100644 src/test/modules/test_checksums/t/DataChecksums/Utils.pm
 create mode 100644 src/test/modules/test_checksums/test_checksums--1.0.sql
 create mode 100644 src/test/modules/test_checksums/test_checksums.c
 create mode 100644 src/test/modules/test_checksums/test_checksums.control

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 57ff333159f..88d260795b8 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -2960,4 +2960,75 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in the cluster
+        and enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch the data checksums mode
+        to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index b88cac598e9..a4e16d03aae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -184,6 +184,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts a <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3f4a27a736e..6082d991497 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3527,8 +3527,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3538,8 +3539,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6877,6 +6878,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a data checksums worker or launcher process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher
+       process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher worker has this value set, the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   online when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations; any progress made online is not reused.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8838fe7f022..7074751834e 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -263,6 +263,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
    The following values are currently supported:
    <variablelist>
+    <varlistentry>
+     <term><literal>checksum_extended</literal></term>
+     <listitem>
+      <para>
+       Runs additional tests for enabling data checksums which inject delays
+       and retries into the processing, as well as tests that run pgbench
+       concurrently and randomly restart the cluster.  Some of these test
+       suites require injection points to be enabled in the installation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><literal>kerberos</literal></term>
      <listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode into
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two additional
+    processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
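As a usage sketch to go with the wal.sgml section above (not part of the patch; parameter
values are purely illustrative), the workflow boils down to:

    -- Start enabling data checksums, with vacuum-like cost-based throttling
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);

    -- The cluster stays in "inprogress-on" until the worker has finished,
    -- at which point this reports "on"
    SHOW data_checksums;

    -- Disabling does not rewrite any pages, only the state is switched
    SELECT pg_disable_data_checksums();
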
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7ffb2179151..46edf531359 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -550,6 +550,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -647,6 +650,36 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version.  After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing.  The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state.  Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_ON_VERSION this would
+ * trigger an assert failure (this is the only transition with an assert) when
+ * processing the barrier.  This may happen if the process is spawned between
+ * the update of XLogCtl->data_checksum_version and the barrier being emitted.
+ * This can only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +748,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(uint32 new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +863,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +879,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4229,6 +4267,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4552,10 +4596,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
 }
 
 /*
@@ -4589,13 +4629,374 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or being enabled or disabled.
+ * During the "inprogress-on" and "inprogress-off" states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. This function must be called as close to the write operation
+ * as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result.  This function must be
+ * called as close to the validation call as possible to keep the critical
+ * section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages, it can thus be used to check for
+ * aborted checksum processing which need to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	/*
+	 * If the process was spawned between updating XLogCtl and emitting the
+	 * barrier it will have seen the updated value, so for the first barrier
+	 * we accept both "on" and "inprogress-on".
+	 */
+	Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+		   (InitialDataChecksumTransition &&
+			(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)));
+
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	InitialDataChecksumTransition = false;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	/*
+	 * We should never get here directly from a cluster with data checksums
+	 * enabled, an inprogress state should be in between.  When there are no
+	 * failures the inprogress-off state should precede, but in case of error
+	 * in processing we can also reach here from the inprogress-on state.
+	 */
+	Assert((LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION) &&
+		   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION));
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_OFF);
+	return true;
+}
+
+/*
+ * InitLocalDataChecksumVersion
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	data_checksums = data_checksum_version;
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -4870,6 +5271,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -5039,6 +5441,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* Use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6180,6 +6587,47 @@ StartupXLOG(void)
 	pfree(endOfRecoveryInfo->recoveryStopReason);
 	pfree(endOfRecoveryInfo);
 
+	/*
+	 * If we reach this point with checksums in the state inprogress-on, it
+	 * means that data checksums were in the process of being enabled when the
+	 * cluster shut down. Since processing didn't finish, the operation will
+	 * have to be restarted from scratch since there is no capability to
+	 * continue where it was when the cluster shut down. Thus, revert the
+	 * state back to off, and inform the user with a warning message. Being
+	 * able to restart processing is a TODO, but it wouldn't be possible to
+	 * restart here since we cannot launch a dynamic background worker
+	 * directly from here (it has to be from a regular backend).
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		ereport(WARNING,
+				(errmsg("data checksums state has been set to off"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+	}
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -6471,7 +6919,7 @@ GetRedoRecPtr(void)
 	XLogRecPtr	ptr;
 
 	/*
-	 * The possibly not up-to-date copy in XlogCtl is enough. Even if we
+	 * The possibly not up-to-date copy in XLogCtl is enough. Even if we
 	 * grabbed a WAL insertion lock to read the authoritative value in
 	 * Insert->RedoRecPtr, someone might update it just after we've released
 	 * the lock.
@@ -7035,6 +7483,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from XLogCtl, valid at the
+	 * time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7290,6 +7744,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7435,6 +7892,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7776,6 +8237,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8187,6 +8652,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8605,6 +9088,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..337932a89e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,45 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	bool		fast = PG_GETARG_BOOL(0);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
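To make the parameter validation above concrete, two hypothetical invocations and the
errors they would raise per the checks in enable_data_checksums():

    SELECT pg_enable_data_checksums(cost_delay => -1);
    -- ERROR:  cost delay cannot be a negative value

    SELECT pg_enable_data_checksums(cost_limit => 0);
    -- ERROR:  cost limit must be greater than zero
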
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index bb7d90aa5d9..54dcfbcb333 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..dea7ad3cf30 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,18 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
+CREATE OR REPLACE FUNCTION
+  pg_disable_data_checksums(fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'disable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +787,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums(boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
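A small sketch of how the defaults in the DDL above play out (the superuser requirement
is additionally enforced in the underlying C functions):

    -- These two calls are equivalent given the declared defaults
    SELECT pg_enable_data_checksums();
    SELECT pg_enable_data_checksums(cost_delay => 0, cost_limit => 100, fast => false);
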
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1b3c5a55882..22f67c7ee4a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1354,6 +1354,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
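To illustrate the new progress view, a query along these lines can be run from another
session while checksums are being enabled (output is hypothetical; the launcher row is
returned first due to the ORDER BY):

    SELECT pid, datname, phase,
           databases_done, databases_total,
           relations_done, relations_total,
           blocks_done, blocks_total
      FROM pg_stat_progress_data_checksums;
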
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..5742a1dd724 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * of LocalDataChecksumVersion (it'll have the value read from the
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 1ad65c237c3..0d2ade1f905 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..ff451d502ba
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1463 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or on a shut-down cluster
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to validate data checksums.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since a dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: Even if the checksum
+ *     on the page happens to already match we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
+	 * launcher what the target state is.
+	 */
+	DataChecksumsWorkerOperation launch_operation;
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	DataChecksumsWorkerOperation operation;
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static DataChecksumsWorkerOperation operation;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entrypoint for starting data checksums processing for enabling as
+ * well as disabling.
+ */
+void
+StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+#ifdef USE_ASSERT_CHECKING
+	/* The cost delay settings have no effect when disabling */
+	if (op == DISABLE_DATACHECKSUMS)
+		Assert(cost_delay == 0 && cost_limit == 0);
+#endif
+
+	/* Store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_operation = op;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the latest
+	 * when it's about to exit, and will loop back to process the new request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					errmsg("failed to start background worker to process data checksums"));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * As of now we only update the block counter for main forks in order to
+	 * not cause too frequent calls. TODO: investigate whether we should do it
+	 * more frequently?
+	 */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  TODO: investigate if this could be
+		 * avoided if the checksum is calculated to be correct and wal_level
+		 * is set to "minimal".
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(operation == ENABLE_DATACHECKSUMS);
+		if (DataChecksumsWorkerShmem->launch_operation == DISABLE_DATACHECKSUMS)
+			abort_requested = true;
+
+		if (abort_requested)
+			return false;
+
+		/*
+		 * As of now we only update the block counter for main forks in order
+		 * to not cause too frequent calls. TODO: investigate whether we
+		 * should do it more frequently?
+		 */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (ForkNumber fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+					   db->dbname),
+				errhint("The max_worker_processes setting might be too low."));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+					   db->dbname),
+				errhint("More details on the error might be found in the server log."));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster died we cannot finish processing this database, so
+	 * exiting is the only option. When enabling checksums we won't at this
+	 * point have changed the pg_control version to enabled, so processing
+	 * will have to be restarted when the cluster comes back up. When
+	 * disabling, the pg_control version will already have been set to off,
+	 * so the cluster will come back up with checksums off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("cannot enable data checksums without the postmaster process"),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			errmsg("initiating data checksum processing in database \"%s\"",
+				   db->dbname));
+
+	snprintf(activity, sizeof(activity),
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("postmaster exited during data checksum processing in \"%s\"",
+					   db->dbname),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				errmsg("data checksum processing was aborted in database \"%s\"",
+					   db->dbname));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits.
+ * We need to clear the launcher-running flag in shared memory to ensure
+ * that processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the processing to
+ * abort.  Processing won't be interrupted immediately, but the abort flag
+ * is checked between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks waiting for all currently running transactions to finish
+ *
+ * Returns when all transactions which were active at the time of the call
+ * have ended.  If the postmaster dies while waiting, the process exits with
+ * FATAL since processing cannot continue.
+ *
+ * NB: this will return early if aborted by SIGINT, or if the target state
+ * is changed while we're waiting.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases.  This function handles the overall bgworker
+ * management, with ProcessAllDatabases being responsible for looping over the
+ * databases and initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
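+	/*
+	 * Copy the requested operation and settings from the launch parameters
+	 * into the fields read during processing by the launcher and the
+	 * per-database workers.
+	 */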
+	operation = DataChecksumsWorkerShmem->launch_operation;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->operation = operation;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		/*
+		 * All backends are now in inprogress-on state and are writing data
+		 * checksums.  Start processing all data at rest.
+		 */
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_operation != operation)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					errmsg("unable to enable data checksums in cluster"));
+		}
+
+		/*
+		 * Data checksums have been set on all pages, set the state to on in
+		 * order to instruct backends to validate checksums on reading.
+		 */
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		int			flags;
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+
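+		/*
+		 * Force a checkpoint to make the new data checksum state consistent
+		 * on disk.
+		 */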
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (DataChecksumsWorkerShmem->immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_operation != operation)
+	{
+		DataChecksumsWorkerShmem->operation = DataChecksumsWorkerShmem->launch_operation;
+		operation = DataChecksumsWorkerShmem->launch_operation;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then a CHECKPOINT_FAST will be
+ * issued. This is useful for testing but should be avoided in production use
+ * as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set things up so that the first database processed also covers the
+	 * shared catalogs; they should not be processed once per database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/* Allow a test case to modify the initial list of databases */
+	INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number is not changed during processing; the column for
+	 * processed databases is instead increased such that it can be compared
+	 * against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			/* Allow a test process to alter the result of the operation */
+			INJECTION_POINT("datachecksumsworker-fail-db", &result);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting with a new database list" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain that
+		 * there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in the processed databases which failed, and
+		 * where the failed database still exists.  This indicates that
+		 * enabling checksums actually failed, and not that the failure was
+		 * due to the db being concurrently dropped.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					errmsg("failed to enable data checksums in \"%s\"", db->dbname));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		/* Force a checkpoint to make everything consistent */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+		ereport(ERROR,
+				errmsg("data checksums could not be enabled in all databases, aborting"),
+				errhint("The server log might have more information on the cause of the error."));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. Immediate (fast)
+	 * checkpoints are intended for running tests, which could otherwise not
+	 * reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if these are redundant assignments after the MemSet, we want
+		 * to be explicit about the initial state for readability, and to keep
+		 * it easy to query should restartability be added later.
+		 */
+		DataChecksumsWorkerShmem->launch_operation = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
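+	/*
+	 * Note that list entries are allocated in the caller's memory context,
+	 * saved in ctx above, so that they survive the transaction used for the
+	 * catalog scan.
+	 */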
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
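+/*
+ * FreeDatabaseList
+ *		Free a list of databases generated by BuildDatabaseList
+ */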
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations with storage (and thus data checksums)
+ * are returned. If include_shared is true then shared relations are included
+ * as well in a non-temporary list; include_shared has no relevance when
+ * building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
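+	/*
+	 * The per-database worker is only used when enabling data checksums;
+	 * disabling requires no per-relation processing and is handled by the
+	 * launcher alone.
+	 */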
+	operation = ENABLE_DATACHECKSUMS;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access these relations to modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->operation == ENABLE_DATACHECKSUMS);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Use the same buffer access strategy as VACUUM to limit the impact on
+	 * shared buffers.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* Process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				errmsg("data checksum processing aborted in database OID %u",
+					   dboid));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
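+		/* Allow a test case to override the number of remaining temp tables */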
+		INJECTION_POINT("datachecksumsworker-fake-temptable-wait", &numleft);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 3000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_TEMPTABLE_WAIT);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		aborted = DataChecksumsWorkerShmem->launch_operation != operation;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					errmsg("data checksum processing aborted in database OID %u",
+						   dboid));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index bf6b55ee830..955df32be5d 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -204,6 +204,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e1d643b013d..3d15a894c3a 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2983,6 +2983,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2fa045e6b0f..44213d140ae 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..6881c6f4069 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -576,6 +577,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index dbb49ed9197..19cf6512e52 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 8714a85e2d9..edc2512d79f 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -378,6 +378,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 13ae57ed649..a290d56f409 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -362,6 +362,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 5427da5bc1b..7f26d78cb77 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -116,6 +116,9 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_TEMPTABLE_WAIT	"Waiting for temporary tables to be dropped for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -352,6 +355,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index c756c2bebaa..f4e264ebf33 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 545d1e90fbd..34cce2ce0be 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,9 +293,18 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+
 		case B_IO_WORKER:
 			backendDesc = gettext_noop("io worker");
 			break;
+
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
+
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -895,7 +904,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 641e535a73c..589e7eab9e8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -750,6 +750,24 @@ InitPostgres(const char *in_dbname, Oid dboid,
 
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
@@ -878,7 +896,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f137129209f..36fba8496df 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", PG_DATA_CHECKSUM_VERSION, true},
+	{"off", PG_DATA_CHECKSUM_OFF, true},
+	{"inprogress-on", PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, true},
+	{"inprogress-off", PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -616,7 +624,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -2043,17 +2050,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5489,6 +5485,16 @@ struct config_enum ConfigureNamesEnum[] =
 		DEFAULT_IO_METHOD, io_method_options,
 		NULL, assign_io_method, NULL
 	},
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		PG_DATA_CHECKSUM_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
 
 	/* End-of-list marker */
 	{
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f20be82862a..8411cecf3ff 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -568,7 +568,7 @@ main(int argc, char *argv[])
 		ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
 		pg_fatal("cluster must be shut down");
 
-	if (ControlFile->data_checksum_version == 0 &&
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_CHECK)
 		pg_fatal("data checksums are not enabled in cluster");
 
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 10de058ce91..acf5c7b026e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d12798be3d8..8bcc5aa8a63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -229,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index cc06fc29ab2..cc78b00fe4c 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	uint32		new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..a8877fb87d1 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;	/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 118d6da1ace..c6f4e31a12f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12356,6 +12356,25 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'bool', proallargtypes => '{bool}',
+  proargmodes => '{i}',
+  proargnames => '{fast}',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1cde4bd9bcf..cf6de4ef12d 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -162,4 +162,20 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..2a0d7b6de42 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -366,6 +366,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -391,6 +394,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..2cd066fd0fe
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for data checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Possible operations the Datachecksumsworker can perform */
+typedef enum DataChecksumsWorkerOperation
+{
+	ENABLE_DATACHECKSUMS,
+	DISABLE_DATACHECKSUMS,
+	/* TODO: VERIFY_DATACHECKSUMS, */
+}			DataChecksumsWorkerOperation;
+
+/*
+ * Possible states for a database entry which has been processed. Exported
+ * here since we want to be able to reference this from injection point tests.
+ */
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
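+/*
+ * SQL-callable entry points for this are pg_enable_data_checksums() and
+ * pg_disable_data_checksums(), for example (argument values here are only
+ * an illustration):
+ *
+ *		SELECT pg_enable_data_checksums(0, 200, false);
+ *		SELECT pg_disable_data_checksums(false);
+ */
+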
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index aeb67c498c5..30fb0f62d4c 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -16,6 +16,7 @@
 
 #include "access/xlogdefs.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/item.h"
 #include "storage/off.h"
 
@@ -205,7 +206,6 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
-#define PG_DATA_CHECKSUM_VERSION	1
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..b3f368a15b5 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,20 @@
 
 #include "storage/block.h"
 
+/*
+ * Checksum version 0 (PG_DATA_CHECKSUM_OFF) means that data checksums are
+ * disabled.  PG_DATA_CHECKSUM_VERSION means that data checksums are enabled
+ * in the cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION mean that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 06a1ffd4b08..b8f7ba0be51 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -85,6 +85,7 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
+PG_LWLOCK(54, DataChecksumsWorker)
 
 /*
  * There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The data checksums launcher and worker
+ * can consume 2 additional slots while data checksums are being enabled or
+ * disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 903a8ac151a..c8f2747b261 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -17,6 +17,7 @@ SUBDIRS = \
 		  test_aio \
 		  test_binaryheap \
 		  test_bloomfilter \
+		  test_checksums \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
 		  test_ddl_deparse \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 93be0f57289..6b4450eb473 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('ssl_passphrase_callback')
 subdir('test_aio')
 subdir('test_binaryheap')
 subdir('test_bloomfilter')
+subdir('test_checksums')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
 subdir('test_ddl_deparse')
diff --git a/src/test/modules/test_checksums/.gitignore b/src/test/modules/test_checksums/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/modules/test_checksums/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
new file mode 100644
index 00000000000..a5b6259a728
--- /dev/null
+++ b/src/test/modules/test_checksums/Makefile
@@ -0,0 +1,40 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/modules/test_checksums
+#
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/modules/test_checksums/Makefile
+#
+#-------------------------------------------------------------------------
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+MODULE_big = test_checksums
+OBJS = \
+	$(WIN32RES) \
+	test_checksums.o
+PGFILEDESC = "test_checksums - test code for data checksums"
+
+EXTENSION = test_checksums
+DATA = test_checksums--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/test/modules/test_checksums/README b/src/test/modules/test_checksums/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/modules/test_checksums/README
@@ -0,0 +1,22 @@
+src/test/modules/test_checksums/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: "make check" creates a temporary installation with multiple nodes,
+a primary as well as standby(s), for the purpose of the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/modules/test_checksums/meson.build b/src/test/modules/test_checksums/meson.build
new file mode 100644
index 00000000000..57156b63599
--- /dev/null
+++ b/src/test/modules/test_checksums/meson.build
@@ -0,0 +1,35 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+test_checksums_sources = files(
+	'test_checksums.c',
+)
+
+test_checksums = shared_module('test_checksums',
+	test_checksums_sources,
+	kwargs: pg_test_mod_args,
+)
+test_install_libs += test_checksums
+
+test_install_data += files(
+	'test_checksums.control',
+	'test_checksums--1.0.sql',
+)
+
+tests += {
+  'name': 'test_checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+       'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+	},
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+      't/005_injection.pl',
+      't/006_concurrent_pgbench.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_checksums/t/001_basic.pl b/src/test/modules/test_checksums/t/001_basic.pl
new file mode 100644
index 00000000000..728a5c4510c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/001_basic.pl
@@ -0,0 +1,63 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+test_checksum_state($node, 'off');
+
+# Enable data checksums and wait for the state transition to 'on'
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1 ");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again, which should be a no-op; we explicitly don't
+# wait for any state transition since none should happen here
+enable_data_checksums($node);
+test_checksum_state($node, 'on');
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again and wait for the state transition
+disable_data_checksums($node, wait => 1);
+
+# Test reading data again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/002_restarts.pl b/src/test/modules/test_checksums/t/002_restarts.pl
new file mode 100644
index 00000000000..75599cf41f2
--- /dev/null
+++ b/src/test/modules/test_checksums/t/002_restarts.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with a
+# restart which breaks processing.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Initialize result storage for queries
+my $result;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 6
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Create a barrier for checksumming to block on, in this case a pre-
+	# existing temporary table which is kept open while processing is started.
+	# We can accomplish this by setting up an interactive psql process which
+	# keeps the temporary table created as we enable checksums in another psql
+	# process.
+	#
+	# This is a similar test to the synthetic variant in 005_injection.pl
+	# which fakes this scenario.
+	my $bsession = $node->background_psql('postgres');
+	$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+	# In another session, make sure we can see the blocking temp table but
+	# start processing anyways and check that we are blocked with a proper
+	# wait event.
+	$result = $node->safe_psql('postgres',
+		"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';"
+	);
+	is($result, 't', 'ensure we can see the temporary table');
+
+	# Enabling data checksums won't be able to finish since the processing is
+	# blocked on the temporary table held open by $bsession.  Ensure that we
+	# reach inprogress-on before we do more tests.
+	enable_data_checksums($node, wait => 'inprogress-on');
+
+	# Wait for the worker to reach the point where it is waiting for the
+	# leftover temp relation to disappear before it can finish
+	$result = $node->poll_query_until(
+		'postgres',
+		"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';",
+		'ChecksumEnableTemptableWait');
+
+	# The datachecksumsworker waits for temporary tables to disappear for 3
+	# seconds before retrying, so sleep for 4 seconds to be guaranteed to see
+	# a retry cycle
+	sleep(4);
+
+	# Re-check the wait event to ensure we are blocked on the right thing.
+	$result = $node->safe_psql('postgres',
+			"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';");
+	is($result, 'ChecksumEnableTemptableWait',
+		'ensure the correct wait condition is set');
+	test_checksum_state($node, 'inprogress-on');
+
+	# Stop the cluster while bsession is still attached.  We can't close the
+	# session first since the brief period between closing and stopping might
+	# be enough for checksums to get enabled.
+	$node->stop;
+	$bsession->quit;
+	$node->start;
+
+	# Ensure the checksums aren't enabled across the restart.  This leaves the
+	# cluster in the same state as before we entered the SKIP block.
+	test_checksum_state($node, 'off');
+}
+
+enable_data_checksums($node, wait => 'on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+disable_data_checksums($node, wait => 1);
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..fe34b4d7d05
--- /dev/null
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -0,0 +1,114 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+test_checksum_state($node_primary, 'off');
+test_checksum_state($node_standby_1, 'off');
+
+# ---------------------------------------------------------------------------
+# Enable checksums for the cluster, and make sure that both the primary and
+# standby change state.
+#
+
+# Ensure that the primary switches to "inprogress-on"
+enable_data_checksums($node_primary, wait => 'inprogress-on');
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+my $result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary and standby
+wait_for_checksum_state($node_primary, 'on');
+wait_for_checksum_state($node_standby_1, 'on');
+
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, '19998', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+#
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+#
+
+# Disable checksums and wait for the operation to be replayed
+disable_data_checksums($node_primary);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+# Ensure that the primary and standby have switched to off
+wait_for_checksum_state($node_primary, 'off');
+wait_for_checksum_state($node_standby_1, 'off');
+# Double-check reading data without errors
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, "19998", 'ensure we can safely read all data without checksums');
+
+$node_standby_1->stop;
+$node_primary->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/004_offline.pl b/src/test/modules/test_checksums/t/004_offline.pl
new file mode 100644
index 00000000000..e9fbcf77eab
--- /dev/null
+++ b/src/test/modules/test_checksums/t/004_offline.pl
@@ -0,0 +1,82 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+enable_data_checksums($node, wait => 'inprogress-on');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/005_injection.pl b/src/test/modules/test_checksums/t/005_injection.pl
new file mode 100644
index 00000000000..f4459e0e636
--- /dev/null
+++ b/src/test/modules/test_checksums/t/005_injection.pl
@@ -0,0 +1,76 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# injection point tests injecting failures into the processing
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# ---------------------------------------------------------------------------
+# Test cluster setup
+#
+
+# Initiate testcluster
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Set up test environment
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+
+# ---------------------------------------------------------------------------
+# Inducing failures in processing
+
+# Force enabling checksums to fail by marking one of the databases as having
+# failed in processing.
+disable_data_checksums($node, wait => 1);
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(true);');
+enable_data_checksums($node, wait => 'off');
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(false);');
+
+# Force the checksum enabling to make multiple passes by removing one
+# database from the list in the first pass.  This simulates a CREATE DATABASE
+# issued during processing.  Doing this via fault injection makes the test
+# independent of exact timing.
+$node->safe_psql('postgres', 'SELECT dcw_prune_dblist(true);');
+enable_data_checksums($node, wait => 'on');
+
+# ---------------------------------------------------------------------------
+# Timing and retry related tests
+#
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 4
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Inject a delay in the barrier for enabling checksums
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_inject_delay_barrier();');
+	enable_data_checksums($node, wait => 'on');
+
+	# Fake the existence of a temporary table at the start of processing,
+	# which forces the processing to wait and retry until it has disappeared.
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);');
+	enable_data_checksums($node, wait => 'on');
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
new file mode 100644
index 00000000000..df037d4b177
--- /dev/null
+++ b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
@@ -0,0 +1,283 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# concurrent activity via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+my $node_primary_slot = 'physical_slot';
+my $node_primary_backup = 'primary_backup';
+my $node_primary;
+my $node_standby_1;
+
+# The number of full test iterations which will be performed
+my $TEST_ITERATIONS = 250;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+my $pgbench_running = 0;
+
+# Variables holding state for managing the cluster and aux processes in
+# various ways
+my @stop_modes = ();
+my ($pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr) = ('', '', '');
+my ($pgb_standby_1_stdin, $pgb_standby_1_stdout, $pgb_standby_1_stderr) = ('', '', '');
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# Helper returning a random boolean value, used for deciding whether to turn
+# things off during testing.
+sub cointoss
+{
+	return int(rand(2));
+}
+
+# Helper for injecting random sleeps here and there in the test run.  The
+# sleep duration won't be predictable, in order to avoid sleep patterns that
+# happen to dodge race conditions and timing bugs.
+sub random_sleep
+{
+	return if cointoss;
+	sleep(int(rand(3)));
+}
+
+# Start a read-only pgbench run in the background against the server specified
+# via the port passed as parameter
+sub background_ro_pgbench
+{
+	my ($port, $stdin, $stdout, $stderr) = @_;
+
+	my $pgbench_primary = IPC::Run::start(
+		[
+			'pgbench', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres'
+		],
+		'<' => \$stdin,
+		'>' => \$stdout,
+		'2>' => \$stderr,
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter
+sub background_rw_pgbench
+{
+	my ($port, $stdin, $stdout, $stderr) = @_;
+
+	my $pgbench_primary = IPC::Run::start(
+		[
+			'pgbench', '-p', $port, '-T', '600', '-c', '10', 'postgres'
+		],
+		'<' => \$stdin,
+		'>' => \$stdout,
+		'2>' => \$stderr,
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Invert the state of data checksums in the cluster: if data checksums are on
+# then disable them, and vice versa.  Also perform validation of the before
+# and after states.
+sub flip_data_checksums
+{
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node_primary->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);')
+			if cointoss();
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums($node_primary, wait => 'inprogress-on');
+		random_sleep();
+		# Wait for checksum enable to be replayed
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Ensure that the standby has switched to "inprogress-on" or "on".
+		# Normally it would be "inprogress-on", but it is theoretically
+		# possible for the primary to complete the checksum enabling *and* have
+		# the standby replay that record before we reach the check below.
+		my $result = $node_standby_1->poll_query_until(
+			'postgres',
+			"SELECT setting = 'off' "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';",
+			'f');
+		is($result, 1,
+			'ensure standby has absorbed the inprogress-on barrier');
+		random_sleep();
+		$result = $node_standby_1->safe_psql('postgres',
+			"SELECT setting " .
+			"FROM pg_catalog.pg_settings " .
+			"WHERE name = 'data_checksums';"
+		);
+
+		is(($result eq 'inprogress-on' || $result eq 'on'),
+			1, 'ensure checksums are on, or in progress, on standby_1');
+
+		# Wait for checksums enabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'on');
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'on');
+
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+		disable_data_checksums($node_primary);
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Wait for checksums disabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'off');
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since that might slip by unnoticed, let's ensure it
+		# gets caught with a test failure.
+		is(1, 0, 'data_checksum_state variable has invalid state');
+	}
+}
+
+# Prepare an array with pg_ctl stop modes which we later can randomly select
+# from in order to stop the cluster in some way.
+for (my $i = 1; $i <= 100; $i++)
+{
+	if (int(rand($i * 2)) > $i)
+	{
+		push(@stop_modes, "immediate");
+	}
+	else
+	{
+		push(@stop_modes, "fast");
+	}
+}
+
+# Create and start a cluster with one primary and one standby node, and ensure
+# they are caught up and in sync.
+$node_primary = PostgreSQL::Test::Cluster->new('main');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be bumped in order to accommodate the pgbench
+# clients, and log_statement is turned down since it would otherwise generate
+# enormous amounts of logging.  Page verification failures are still logged.
+$node_primary->append_conf('postgresql.conf',
+	qq[
+max_connections = 30
+log_statement = none
+]);
+$node_primary->start;
+$node_primary->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$node_primary_slot');");
+$node_primary->backup($node_primary_backup);
+
+$node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $node_primary_backup,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$node_primary_slot'
+]);
+$node_standby_1->start;
+
+$node_primary->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Start the test suite with pgbench running.
+background_ro_pgbench($node_standby_1->port,
+	$pgb_standby_1_stdin, $pgb_standby_1_stdout, $pgb_standby_1_stderr);
+background_rw_pgbench($node_primary->port,
+	$pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr);
+
+# Main test suite.  This loop will start a pgbench run on the cluster and,
+# while that's running, flip the state of data checksums concurrently.  It
+# will then randomly restart the cluster (in fast or immediate mode) and then
+# check for the desired state.  The idea behind doing things randomly is to
+# flush out any timing related issues by subjecting the cluster to varied
+# workloads.  A TODO is to generate a trace such that any test failure can be
+# traced back to its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	if (!$node_primary->is_alive)
+	{
+		# If data checksums are enabled, take the opportunity to verify them
+		# while the cluster is offline
+		$node_primary->checksum_verify_offline()
+			unless $data_checksum_state eq 'off';
+		random_sleep();
+		$node_primary->start;
+		# Start a pgbench in the background against the primary
+		background_rw_pgbench($node_primary->port, $pgb_primary_stdin,
+			$pgb_primary_stdout, $pgb_primary_stderr);
+	}
+
+	if (!$node_standby_1->is_alive)
+	{
+		# If data checksums are enabled, take the opportunity to verify them
+		# while the cluster is offline
+		$node_standby_1->checksum_verify_offline()
+			unless $data_checksum_state eq 'off';
+		random_sleep();
+		$node_standby_1->start;
+		# Start a select-only pgbench in the background on the standby
+		background_ro_pgbench($node_standby_1->port, $pgb_standby_1_stdin,
+			$pgb_standby_1_stdout, $pgb_standby_1_stderr);
+	}
+
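+	# Modify every row so that each iteration produces freshly dirtied pages
+	# while the checksum state is changing.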
+	$node_primary->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result = $node_primary->safe_psql('postgres',
+		"SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+	random_sleep();
+	$node_primary->wait_for_catchup($node_standby_1, 'write');
+
+	# Potentially powercycle the cluster
+	$node_primary->stop($stop_modes[ int(rand(100)) ]) if cointoss();
+	random_sleep();
+	$node_standby_1->stop($stop_modes[ int(rand(100)) ]) if cointoss();
+}
+
+# The test run is over; ensure that data reads back as expected and perform a
+# final verification of the data checksum state.
+my $result = $node_primary->safe_psql('postgres',
+	"SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node_primary, $data_checksum_state);
+test_checksum_state($node_standby_1, $data_checksum_state);
+
+$node_standby_1->teardown_node;
+$node_primary->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/DataChecksums/Utils.pm b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
new file mode 100644
index 00000000000..ee2f2a1428f
--- /dev/null
+++ b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+DataChecksums::Utils - Utility functions for testing data checksums in a running cluster
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::Cluster;
+  use DataChecksums::Utils qw( .. );
+
+  # Create, and start, a new cluster
+  my $node = PostgreSQL::Test::Cluster->new('primary');
+  $node->init;
+  $node->start;
+
+  test_checksum_state($node, 'off');
+
+  enable_data_checksums($node);
+
+  wait_for_checksum_state($node, 'on');
+
+
+=cut
+
+package DataChecksums::Utils;
+
+use strict;
+use warnings FATAL => 'all';
+use Exporter 'import';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+our @EXPORT = qw(
+  test_checksum_state
+  wait_for_checksum_state
+  enable_data_checksums
+  disable_data_checksums
+);
+
+=pod
+
+=head1 METHODS
+
+=over
+
+=item test_checksum_state(node, state)
+
+Test that the current value of the data checksum GUC in the server running
+at B<node> matches B<state>.  If the values differ, a test failure is logged.
+Returns True if the values match, otherwise False.
+
+=cut
+
+sub test_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $result = $postgresnode->safe_psql('postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+	);
+	is($result, $state, 'ensure checksums are set to ' . $state);
+	return $result eq $state;
+}
+
+=item wait_for_checksum_state(node, state)
+
+Test the value of the data checksum GUC in the server running at B<node>
+repeatedly until it matches B<state> or times out.  Processing will run for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.  If the
+values differ when the process times out, False is returned and a test failure
+is logged, otherwise True.
+
+=cut
+
+sub wait_for_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $res = $postgresnode->poll_query_until(
+		'postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+		$state);
+	is($res, 1, 'ensure data checksums are transitioned to ' . $state);
+	return $res == 1;
+}
+
+=item enable_data_checksums($node, %params)
+
+Function for enabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item cost_delay
+
+The C<cost_delay> to use when enabling data checksums, default is 0.
+
+=item cost_limit
+
+The C<cost_limit> to use when enabling data checksums, default is 100.
+
+=item fast
+
+If set to C<true> an immediate checkpoint will be issued after data
+checksums are enabled.  Setting this to false will lead to slower tests.
+The default is true.
+
+=item wait
+
+If defined, the function will wait for the state given in this parameter, or
+until timing out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+
+=back
+
+=cut
+
+sub enable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{cost_delay} = 0 unless (defined($params{cost_delay}));
+	$params{cost_limit} = 100 unless (defined($params{cost_limit}));
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_enable_data_checksums(%s, %s, %s);
+EOQ
+
+	$postgresnode->safe_psql(
+		'postgres',
+		sprintf($query,
+			$params{cost_delay}, $params{cost_limit}, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, $params{wait})
+	  if (defined($params{wait}));
+}
+
+=item disable_data_checksums($node, %params)
+
+Function for disabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item wait
+
+If defined, the function will wait for the state to turn to B<off>, or until
+timing out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+Unlike in C<enable_data_checksums> the value of the parameter is discarded.
+
+=back
+
+=cut
+
+sub disable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_disable_data_checksums(%s);
+EOQ
+
+	$postgresnode->safe_psql('postgres', sprintf($query, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, 'off') if (defined($params{wait}));
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/modules/test_checksums/test_checksums--1.0.sql b/src/test/modules/test_checksums/test_checksums--1.0.sql
new file mode 100644
index 00000000000..704b45a3186
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums--1.0.sql
@@ -0,0 +1,20 @@
+/* src/test/modules/test_checksums/test_checksums--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_checksums" to load this file. \quit
+
+CREATE FUNCTION dcw_inject_delay_barrier(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_inject_fail_database(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_prune_dblist(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_fake_temptable(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_checksums/test_checksums.c b/src/test/modules/test_checksums/test_checksums.c
new file mode 100644
index 00000000000..26897bff960
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.c
@@ -0,0 +1,173 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_checksums.c
+ *		Test data checksums
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_checksums/test_checksums.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/latch.h"
+#include "utils/injection_point.h"
+#include "utils/wait_event.h"
+
+#define USEC_PER_SEC    1000000
+
+
+PG_MODULE_MAGIC;
+
+extern PGDLLEXPORT void dc_delay_barrier(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fail_database(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_dblist(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fake_temptable(const char *name, const void *private_data, void *arg);
+
+/*
+ * Test for delaying emission of procsignalbarriers.
+ */
+void
+dc_delay_barrier(const char *name, const void *private_data, void *arg)
+{
+	(void) name;
+	(void) private_data;
+
+	(void) WaitLatch(MyLatch,
+					 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+					 (3 * 1000),
+					 WAIT_EVENT_PG_SLEEP);
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_delay_barrier);
+Datum
+dcw_inject_delay_barrier(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-enable-checksums-delay",
+							 "test_checksums",
+							 "dc_delay_barrier",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksums-enable-checksums-delay");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
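+/*
+ * Test for failing the processing of a database.  The first invocation marks
+ * the per-database result as failed, forcing the launcher to handle a failed
+ * database; subsequent invocations do nothing so that a retry can succeed.
+ */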
+void
+dc_fail_database(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	DataChecksumsWorkerResult *res = (DataChecksumsWorkerResult *) arg;
+
+	if (first_pass)
+		*res = DATACHECKSUMSWORKER_FAILED;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_fail_database);
+Datum
+dcw_inject_fail_database(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fail-db",
+							 "test_checksums",
+							 "dc_fail_database",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fail-db");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to remove an entry from the Databaselist to force re-processing since
+ * not all databases could be processed in the first iteration of the loop.
+ */
+void
+dc_dblist(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	List	   *DatabaseList = (List *) arg;
+
+	if (first_pass)
+		DatabaseList = list_delete_last(DatabaseList);
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_prune_dblist);
+Datum
+dcw_prune_dblist(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-initial-dblist",
+							 "test_checksums",
+							 "dc_dblist",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-initial-dblist");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to force waiting for existing temptables.
+ */
+void
+dc_fake_temptable(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	int		   *numleft = (int *) arg;
+
+	if (first_pass)
+		*numleft = 1;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_fake_temptable);
+Datum
+dcw_fake_temptable(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fake-temptable-wait",
+							 "test_checksums",
+							 "dc_fake_temptable",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fake-temptable-wait");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_checksums/test_checksums.control b/src/test/modules/test_checksums/test_checksums.control
new file mode 100644
index 00000000000..84b4cc035a7
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.control
@@ -0,0 +1,4 @@
+comment = 'Test code for data checksums'
+default_version = '1.0'
+module_pathname = '$libdir/test_checksums'
+relocatable = true
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 35413f14019..3af7944acea 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3872,6 +3872,51 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "# Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "# Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
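+=item $node->checksum_verify_offline()
+
+Verify data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+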
+sub checksum_verify_offline
+{
+	my ($self) = @_;
+
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-c');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 35e8aad7701..4b9c5526e50 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2071,6 +2071,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 605f5070376..9042e4d38e3 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -59,6 +59,22 @@ io worker|relation|vacuum
 io worker|temp relation|normal
 io worker|wal|init
 io worker|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..df0f49ea2aa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -416,6 +416,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -608,6 +609,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4243,6 +4248,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#54Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#53)
Re: Changing the state of data checksums in a running cluster

On 8/27/25 10:30, Daniel Gustafsson wrote:

On 26 Aug 2025, at 01:06, Tomas Vondra <tomas@vondra.me> wrote:

I think this TAP looks very nice, but there's a couple issues with it.
See the attached patch fixing those.

Thanks, I have incorporated (most of) your patch in the attached.  I did keep
the PG_TEST_EXTRA check for injection points though, which I assume was
removed by mistake.

Yes, that was a mistake.

With these changes it runs for me, and I even saw some

LOG: page verification failed

in tmp_check/log/006_concurrent_pgbench_standby_1.log. But it takes a
while - a couple minutes, maybe? I think I saw it at

t/006_concurrent_pgbench.pl .. 427/?

That's very interesting, I have been running it to timeout several times in a
row without hitting any verification failures. Will keep running.

Just to be clear - I don't see any pg_checksums failures either. I only
see failures in the standby log, and I don't think the script checks
that (it probably should).

or something like that. I think the bash version did a couple things
differently, which might make the failures more frequent (but it's just
a wild guess).

In particular, I think the script restarts the two nodes independently,
while the TAP always stops both primary and standby, in this order. I
think it'd be useful to restart one or both.

Done in the attached, it will now randomly stop one node, both, or neither.
If a node is stopped I've added an offline pg_checksums step to validate the
data files as a why-not test.

The other thing is the bash script added some random delays/sleep, which
increases the test duration, but it also means generating somewhat
random amounts of data, etc. It also randomized some other stuff (scale,
client count, ...). But that can wait.

Added as well in a few places, maybe more can be sprinkled in.
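If we also want to randomize the pgbench scale and client count at some point,
something along these lines could do (an untested sketch, ranges picked out of
thin air):

    # Randomize the pgbench scale factor and client count for each run
    my $scale = 10 + int(rand(90));
    my $clients = 2 + int(rand(9));
    $node_primary->command_ok(
        [ 'pgbench', '-i', '-s', $scale, '-q', 'postgres' ]);
    # ...and pass $clients instead of the hardcoded -c 10 in the
    # background_*_pgbench helpers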

Thanks. I'll take a look.

regards

--
Tomas Vondra

#55Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#54)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 27 Aug 2025, at 11:39, Tomas Vondra <tomas@vondra.me> wrote:

Just to be clear - I don't see any pg_checksums failures either. I only
see failures in the standby log, and I don't think the script checks
that (it probably should).

Right, that's what I've been checking too. I have been considering adding
another background process for monitoring all the log entries but I just
thought of a much simpler solution. When the clusters are turned off we can
take the opportunity to slurp the log written since last restart and inspect
it. The attached adds this.
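Roughly like this, as an untested sketch (the $primary_log_offset variable is
illustrative and would be tracked per node; slurp_file() and logfile() are the
existing test helpers):

    # Inspect everything the primary has logged since its previous start and
    # make sure no checksum validation failures were reported
    my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
        $primary_log_offset);
    unlike($log, qr/page verification failed/,
        'no page verification failures logged on primary');

    # Remember the current log size before starting the node again, so the
    # next round only looks at new entries
    $primary_log_offset = -s $node_primary->logfile;
    $node_primary->start;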

It would probably be good to at some point clean this up a little by placing
all of the variables for a single node in a hash which can be passed around,
and by moving repeated code into subroutines, etc.

--
Daniel Gustafsson

Attachments:

v202508272-0001-Online-enabling-and-disabling-of-data-che.patch (application/octet-stream)
From 670561e12c97e0a36b0efff06d778f9f6c63e978 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 15 Aug 2025 14:48:02 +0200
Subject: [PATCH v202508272] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Prior to this, data checksums could only be enabled during initdb or
while the cluster was offline using the pg_checksums application.  This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  Tomas
Vondra has given invaluable assistance with not only review but
very in-depth testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func/func-admin.sgml             |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/regress.sgml                     |   12 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  547 +++++-
 src/backend/access/transam/xlogfuncs.c        |   43 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   16 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/auxprocess.c           |   19 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1463 +++++++++++++++++
 src/backend/postmaster/launch_backend.c       |    3 +
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   13 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |    6 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    4 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |   12 +-
 src/backend/utils/init/postinit.c             |   20 +-
 src/backend/utils/misc/guc_tables.c           |   30 +-
 src/bin/pg_checksums/pg_checksums.c           |    4 +-
 src/bin/pg_controldata/pg_controldata.c       |    2 +
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   17 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   19 +
 src/include/commands/progress.h               |   16 +
 src/include/miscadmin.h                       |    6 +
 src/include/postmaster/datachecksumsworker.h  |   51 +
 src/include/storage/bufpage.h                 |    2 +-
 src/include/storage/checksum.h                |   14 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/Makefile                             |   11 +-
 src/test/modules/Makefile                     |    1 +
 src/test/modules/meson.build                  |    1 +
 src/test/modules/test_checksums/.gitignore    |    2 +
 src/test/modules/test_checksums/Makefile      |   40 +
 src/test/modules/test_checksums/README        |   22 +
 src/test/modules/test_checksums/meson.build   |   35 +
 .../modules/test_checksums/t/001_basic.pl     |   63 +
 .../modules/test_checksums/t/002_restarts.pl  |  110 ++
 .../test_checksums/t/003_standby_restarts.pl  |  114 ++
 .../modules/test_checksums/t/004_offline.pl   |   82 +
 .../modules/test_checksums/t/005_injection.pl |   76 +
 .../t/006_concurrent_pgbench.pl               |  326 ++++
 .../test_checksums/t/DataChecksums/Utils.pm   |  185 +++
 .../test_checksums/test_checksums--1.0.sql    |   20 +
 .../modules/test_checksums/test_checksums.c   |  173 ++
 .../test_checksums/test_checksums.control     |    4 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   45 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 68 files changed, 4120 insertions(+), 56 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/modules/test_checksums/.gitignore
 create mode 100644 src/test/modules/test_checksums/Makefile
 create mode 100644 src/test/modules/test_checksums/README
 create mode 100644 src/test/modules/test_checksums/meson.build
 create mode 100644 src/test/modules/test_checksums/t/001_basic.pl
 create mode 100644 src/test/modules/test_checksums/t/002_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/003_standby_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/004_offline.pl
 create mode 100644 src/test/modules/test_checksums/t/005_injection.pl
 create mode 100644 src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
 create mode 100644 src/test/modules/test_checksums/t/DataChecksums/Utils.pm
 create mode 100644 src/test/modules/test_checksums/test_checksums--1.0.sql
 create mode 100644 src/test/modules/test_checksums/test_checksums.c
 create mode 100644 src/test/modules/test_checksums/test_checksums.control

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 57ff333159f..88d260795b8 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -2960,4 +2960,75 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
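
As an illustration (not part of the diff), a minimal SQL session using the two
functions documented above could look like the following; the cost settings
are arbitrary example values and the session assumes a superuser connection:

    -- Start enabling data checksums, throttled like cost-based vacuum delay
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);

    -- The cluster-wide state is visible via the data_checksums GUC; it
    -- reports "inprogress-on" until the worker has finished, then "on"
    SHOW data_checksums;

    -- Disabling goes through "inprogress-off" before ending up at "off"
    SELECT pg_disable_data_checksums();
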
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index b88cac598e9..a4e16d03aae 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -184,6 +184,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts <glossterm linkend="glossary-data-checksums-worker"> processes</glossterm>
+     for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3f4a27a736e..6082d991497 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3527,8 +3527,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3538,8 +3539,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6877,6 +6878,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of the launcher or a data checksums worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the other worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
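
As an illustration (not part of the diff), progress while enabling checksums
could be monitored with a query along these lines, using the columns defined
above (the launcher row carries the database counters, the worker rows carry
the relation and block counters):

    SELECT pid, datname, phase,
           databases_done, databases_total,
           relations_done, relations_total,
           blocks_done, blocks_total
      FROM pg_stat_progress_data_checksums;
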
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any progress made by the
+   online processing before the shutdown.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8838fe7f022..7074751834e 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -263,6 +263,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
    The following values are currently supported:
    <variablelist>
+    <varlistentry>
+     <term><literal>checksum_extended</literal></term>
+     <listitem>
+      <para>
+       Runs additional tests for enabling data checksums which inject delays
+       and retries into the processing, as well as tests that run pgbench
+       concurrently and randomly restart the cluster.  Some of these test
+       suites require injection points to be enabled in the installation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><literal>kerberos</literal></term>
      <listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode in
+    <literal>inprogress-on</literal> mode.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two additional
+    processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over, there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause significant I/O to the system, as most of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
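
To make the restart scenario above concrete (not part of the diff), the manual
steps after an interrupted run amount to re-issuing the enable call once the
cluster is back up; the throttling values are example values:

    -- After the restart the data checksums state has been reverted to "off"
    SHOW data_checksums;

    -- Processing does not resume automatically, so simply start it again
    SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
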
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7ffb2179151..46edf531359 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -550,6 +550,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -647,6 +650,36 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version.  After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing.  The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state.  Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_VERSION this would
+ * trigger an assert failure (this is the only transition with an assert) when
+ * processing the barrier.  This may happen if the process is spawned between
+ * the update of XLogCtl->data_checksum_version and the barrier being emitted.
+ * This can only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +748,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(uint32 new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +863,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or checksums are enabled (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +879,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4229,6 +4267,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4552,10 +4596,6 @@ ReadControlFile(void)
 		(SizeOfXLogLongPHD - SizeOfXLogShortPHD);
 
 	CalculateCheckpointSegments();
-
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
 }
 
 /*
@@ -4589,13 +4629,374 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled or are in the process of being
+ * enabled or disabled.  During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
+ */
+bool
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  The call
+ * should be made as close to the validation as possible to keep the critical
+ * section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used while the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(void)
 {
+	uint64		barrier;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(void)
+{
+	uint64		barrier;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+}
+
+/*
+ * ProcSignalBarrier absorption functions for enabling and disabling data
+ * checksums in a running cluster. The procsignalbarriers are emitted in the
+ * SetDataChecksums* functions.
+ */
+bool
+AbsorbChecksumsOnInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOnBarrier(void)
+{
+	/*
+	 * If the process was spawned between updating XLogCtl and emitting the
+	 * barrier it will have seen the updated value, so for the first barrier
+	 * we accept both "on" and "inprogress-on".
+	 */
+	Assert((LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION) ||
+		   (InitialDataChecksumTransition &&
+			(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)));
+
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_VERSION);
+	InitialDataChecksumTransition = false;
+	return true;
+}
+
+bool
+AbsorbChecksumsOffInProgressBarrier(void)
+{
+	Assert(LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+	return true;
+}
+
+bool
+AbsorbChecksumsOffBarrier(void)
+{
+	/*
+	 * We should never get here directly from a cluster with data checksums
+	 * enabled; an inprogress state should be in between.  When there are no
+	 * failures the inprogress-off state should precede, but in case of error
+	 * in processing we can also reach here from the inprogress-on state.
+	 */
+	Assert((LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION) &&
+		   (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION));
+	SetLocalDataChecksumVersion(PG_DATA_CHECKSUM_OFF);
+	return true;
+}
+
+/*
+ * InitLocalDataChecksumVersion
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	data_checksums = data_checksum_version;
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -4870,6 +5271,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -5039,6 +5441,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* Use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6180,6 +6587,47 @@ StartupXLOG(void)
 	pfree(endOfRecoveryInfo->recoveryStopReason);
 	pfree(endOfRecoveryInfo);
 
+	/*
+	 * If we reach this point with checksums in the state inprogress-on, it
+	 * means that data checksums were in the process of being enabled when the
+	 * cluster shut down. Since processing didn't finish, the operation will
+	 * have to be restarted from scratch as there is no capability to
+	 * continue from where it was when the cluster shut down. Thus, revert
+	 * the state to off, and inform the user with a warning message. Being
+	 * able to restart processing is a TODO, but it wouldn't be possible to
+	 * restart here since we cannot launch a dynamic background worker
+	 * directly from here (it has to be from a regular backend).
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		ereport(WARNING,
+				(errmsg("data checksums state has been set to \"off\""),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+	}
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -6471,7 +6919,7 @@ GetRedoRecPtr(void)
 	XLogRecPtr	ptr;
 
 	/*
-	 * The possibly not up-to-date copy in XlogCtl is enough. Even if we
+	 * The possibly not up-to-date copy in XLogCtl is enough. Even if we
 	 * grabbed a WAL insertion lock to read the authoritative value in
 	 * Insert->RedoRecPtr, someone might update it just after we've released
 	 * the lock.
@@ -7035,6 +7483,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from XLogCtl, valid at the
+	 * time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7290,6 +7744,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7435,6 +7892,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7776,6 +8237,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8187,6 +8652,24 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8605,6 +9088,46 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..337932a89e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,45 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	bool		fast = PG_GETARG_BOOL(0);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	if (!superuser())
+		ereport(ERROR, errmsg("must be superuser"));
+
+	if (cost_delay < 0)
+		ereport(ERROR, errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR, errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
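
As an illustration of the argument checks above (not part of the diff; the
error texts are taken from the ereport calls, the rest of the session is a
sketch assuming a superuser connection):

    SELECT pg_enable_data_checksums(cost_delay => -1);
    -- ERROR:  cost delay cannot be a negative value

    SELECT pg_enable_data_checksums(cost_limit => 0);
    -- ERROR:  cost limit must be greater than zero
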
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index bb7d90aa5d9..54dcfbcb333 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 566f308e443..dea7ad3cf30 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -650,6 +650,18 @@ LANGUAGE INTERNAL
 CALLED ON NULL INPUT VOLATILE PARALLEL SAFE
 AS 'pg_stat_reset_slru';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums'
+  PARALLEL RESTRICTED;
+
+CREATE OR REPLACE FUNCTION
+  pg_disable_data_checksums(fast boolean DEFAULT false)
+  RETURNS void STRICT VOLATILE LANGUAGE internal AS 'disable_data_checksums'
+  PARALLEL RESTRICTED;
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -775,6 +787,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums(boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1b3c5a55882..22f67c7ee4a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1354,6 +1354,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..5742a1dd724 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 1ad65c237c3..0d2ade1f905 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..ff451d502ba
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1463 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a database at initdb time or when shut down
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This
+ * ensures that backends which have yet to move from the "on" state will still
+ * be able to validate data checksums.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums but don't validate them, so that
+ *   backends still in Be can continue to validate pages until they too have
+ *   absorbed the barrier and moved to Bo.  Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page already matches, we still dirty the page.  It
+ *     should be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering its processing to have failed.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
+	 * launcher what the target state is.
+	 */
+	DataChecksumsWorkerOperation launch_operation;
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	DataChecksumsWorkerOperation operation;
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static DataChecksumsWorkerOperation operation;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * This starts data checksum processing, for enabling as well as for
+ * disabling; see the usage sketch below.
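+ *
+ * As a usage sketch, processing is normally initiated via the SQL-level
+ * functions added by this patch, whose arguments end up as the parameters of
+ * this function (invocations shown with default arguments assumed):
+ *
+ *     SELECT pg_enable_data_checksums();
+ *     SELECT pg_disable_data_checksums();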
+ */
+void
+StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+#ifdef USE_ASSERT_CHECKING
+	/* The cost delay settings have no effect when disabling */
+	if (op == DISABLE_DATACHECKSUMS)
+		Assert(cost_delay == 0 && cost_limit == 0);
+#endif
+
+	/* Store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_operation = op;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back and process the new
+	 * request.  So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					errmsg("failed to start background worker to process data checksums"));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation in pg_stat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * As of now we only update the block counter for main forks in order to
+	 * not cause too frequent calls.  TODO: investigate whether we should do
+	 * it more frequently.
+	 */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off.  This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  TODO: investigate if this could be
+		 * avoided if the checksum is calculated to be correct and wal_level
+		 * is set to "minimal".
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here.  It's safe to check this without a
+		 * lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(operation == ENABLE_DATACHECKSUMS);
+		if (DataChecksumsWorkerShmem->launch_operation == DISABLE_DATACHECKSUMS)
+			abort_requested = true;
+
+		if (abort_requested)
+			return false;
+
+		/*
+		 * As of now we only update the block counter for main forks in order
+		 * to not cause too frequent calls.  TODO: investigate whether we
+		 * should do it more frequently.
+		 */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (ForkNumber fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksumsworker worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksumsworker worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+					   db->dbname),
+				errhint("The max_worker_processes setting might be too low."));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+					   db->dbname),
+				errhint("More details on the error might be found in the server log."));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed, we cannot end up with a processed database,
+	 * so we have no alternative other than exiting.  When enabling checksums
+	 * we won't at this point have changed the pg_control state to enabled, so
+	 * processing will have to be restarted when the cluster comes back up.
+	 * When disabling, the pg_control state will have been set to off before
+	 * this, so checksums will be off as expected when the cluster comes up.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("cannot enable data checksums without the postmaster process"),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			errmsg("initiating data checksum processing in database \"%s\"",
+				   db->dbname));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errmsg("postmaster exited during data checksum processing in \"%s\"",
+					   db->dbname),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				errmsg("data checksums processing was aborted in database \"%s\"",
+					   db->dbname));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits.  We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between blocks in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended.  If the postmaster dies while waiting, the process
+ * exits with FATAL.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * changes while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksumsworker\" launcher started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	operation = DataChecksumsWorkerShmem->launch_operation;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->operation = operation;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress();
+
+		/*
+		 * All backends are now in inprogress-on state and are writing data
+		 * checksums.  Start processing all data at rest.
+		 */
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_operation != operation)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					errmsg("unable to enable data checksums in cluster"));
+		}
+
+		/*
+		 * Data checksums have been set on all pages, set the state to on in
+		 * order to instruct backends to validate checksums on reading.
+		 */
+		SetDataChecksumsOn();
+	}
+	else
+	{
+		int			flags;
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff();
+
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (DataChecksumsWorkerShmem->immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_operation != operation)
+	{
+		DataChecksumsWorkerShmem->operation = DataChecksumsWorkerShmem->launch_operation;
+		operation = DataChecksumsWorkerShmem->launch_operation;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process when enabling
+ * checksums, looping to compute a new list and compare it against the
+ * databases already seen until no new ones are found.
+ *
+ * If immediate_checkpoint is set to true then the checkpoints requested here
+ * use CHECKPOINT_FAST.  This is useful for testing but should be avoided in
+ * production use as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also handles the shared
+	 * catalogs; they are not re-processed for every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/* Allow a test case to modify the initial list of databases */
+	INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number should not change during processing; the column
+	 * for processed databases is instead increased such that it can be
+	 * compared against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			/* Allow a test process to alter the result of the operation */
+			INJECTION_POINT("datachecksumsworker-fail-db", &result);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting to look for new databases" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass.  Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing.  A failure to enable checksums for a database can be
+	 * because it actually failed for some reason, or because the database
+	 * was dropped between us getting the database list and trying to process
+	 * it.  Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it.  If a
+	 * database still exists but enabling checksums failed, then we fail the
+	 * entire checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in the processed databases which failed, and
+		 * where the failed database still exists.  This indicates that
+		 * enabling checksums actually failed, and not that the failure was
+		 * due to the db being concurrently dropped.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					errmsg("failed to enable data checksums in \"%s\"", db->dbname));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff();
+		/* Force a checkpoint to make everything consistent */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+		ereport(ERROR,
+				errmsg("data checksums failed to get enabled in all databases, aborting"),
+				errhint("The server log might have more information on the cause of the error."));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if these are redundant assignments, we want to be explicit
+		 * about our intent for readability, since this state may need to be
+		 * queried when support for restartability is added.
+		 */
+		DataChecksumsWorkerShmem->launch_operation = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types.  If temp_relations
+ * is true then only temporary relations are returned.  If temp_relations is
+ * false then non-temporary relations which have storage (and thus carry data
+ * checksums) are returned.  If include_shared is true then shared relations
+ * are included as well in a non-temporary list; include_shared has no
+ * relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database.  This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
+	operation = ENABLE_DATACHECKSUMS;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start.
+	 * We need to wait until they are all gone before we are done, since we
+	 * cannot access these relations to modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->operation == ENABLE_DATACHECKSUMS);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* Process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				errmsg("data checksum processing aborted in database OID %u",
+					   dboid));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		INJECTION_POINT("datachecksumsworker-fake-temptable-wait", &numleft);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, so indicate this in
+		 * pg_stat_activity and in progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 3000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_TEMPTABLE_WAIT);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_operation != operation;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					errmsg("data checksum processing aborted in database OID %u",
+						   dboid));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index bf6b55ee830..955df32be5d 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -204,6 +204,9 @@ static child_process_kind child_process_kinds[] = {
 	[B_WAL_SUMMARIZER] = {"wal_summarizer", WalSummarizerMain, true},
 	[B_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
 
+	[B_DATACHECKSUMSWORKER_LAUNCHER] = {"datachecksum launcher", NULL, false},
+	[B_DATACHECKSUMSWORKER_WORKER] = {"datachecksum worker", NULL, false},
+
 	[B_LOGGER] = {"syslogger", SysLoggerMain, false},
 };
 
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e1d643b013d..3d15a894c3a 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2983,6 +2983,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2fa045e6b0f..44213d140ae 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..6881c6f4069 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,6 +18,7 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -576,6 +577,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbChecksumsOnInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbChecksumsOnBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbChecksumsOffInProgressBarrier();
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbChecksumsOffBarrier();
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index dbb49ed9197..19cf6512e52 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 8714a85e2d9..edc2512d79f 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -378,6 +378,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 13ae57ed649..a290d56f409 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -362,6 +362,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 5427da5bc1b..7f26d78cb77 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -116,6 +116,9 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_TEMPTABLE_WAIT	"Waiting for temporary tables to be dropped for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -352,6 +355,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index c756c2bebaa..f4e264ebf33 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -274,6 +274,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1146,9 +1148,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1164,9 +1163,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 545d1e90fbd..34cce2ce0be 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -293,9 +293,18 @@ GetBackendTypeDesc(BackendType backendType)
 		case B_CHECKPOINTER:
 			backendDesc = gettext_noop("checkpointer");
 			break;
+
 		case B_IO_WORKER:
 			backendDesc = gettext_noop("io worker");
 			break;
+
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+			backendDesc = gettext_noop("datachecksumsworker launcher");
+			break;
+		case B_DATACHECKSUMSWORKER_WORKER:
+			backendDesc = gettext_noop("datachecksumsworker worker");
+			break;
+
 		case B_LOGGER:
 			backendDesc = gettext_noop("logger");
 			break;
@@ -895,7 +904,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 641e535a73c..589e7eab9e8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -750,6 +750,24 @@ InitPostgres(const char *in_dbname, Oid dboid,
 
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
@@ -878,7 +896,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f137129209f..36fba8496df 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", PG_DATA_CHECKSUM_VERSION, true},
+	{"off", PG_DATA_CHECKSUM_OFF, true},
+	{"inprogress-on", PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, true},
+	{"inprogress-off", PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -616,7 +624,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
@@ -2043,17 +2050,6 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
-			gettext_noop("Shows whether data checksums are turned on for this cluster."),
-			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
-		},
-		&data_checksums,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"syslog_sequence_numbers", PGC_SIGHUP, LOGGING_WHERE,
 			gettext_noop("Add sequence number to syslog messages to avoid duplicate suppression."),
@@ -5489,6 +5485,16 @@ struct config_enum ConfigureNamesEnum[] =
 		DEFAULT_IO_METHOD, io_method_options,
 		NULL, assign_io_method, NULL
 	},
+	{
+		{"data_checksums", PGC_INTERNAL, PRESET_OPTIONS,
+			gettext_noop("Shows whether data checksums are turned on for this cluster."),
+			NULL,
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
+		},
+		&data_checksums,
+		PG_DATA_CHECKSUM_OFF, data_checksums_options,
+		NULL, NULL, show_data_checksums
+	},
 
 	/* End-of-list marker */
 	{
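
To make the new GUC states concrete, here is a minimal TAP-style sketch (illustrative only, not part of the patch) which watches data_checksums move through the states listed in data_checksums_options while checksums are enabled and disabled online.  It uses the SQL functions added by this patch with all arguments spelled out, since any DEFAULT clauses would live outside the hunks shown here.

  use strict;
  use warnings FATAL => 'all';
  use PostgreSQL::Test::Cluster;

  # Set up a cluster without data checksums, as the TAP tests below do.
  my $node = PostgreSQL::Test::Cluster->new('guc_demo');
  $node->init(no_data_checksums => 1);
  $node->start;

  # Online enabling moves the GUC off -> inprogress-on -> on.
  $node->safe_psql('postgres', 'SELECT pg_enable_data_checksums(0, 100, true);');
  print 'state: ' . $node->safe_psql('postgres', 'SHOW data_checksums;') . "\n";
  $node->poll_query_until('postgres',
      "SELECT setting FROM pg_settings WHERE name = 'data_checksums';", 'on');

  # Disabling goes through inprogress-off before reaching off.
  $node->safe_psql('postgres', 'SELECT pg_disable_data_checksums(true);');
  $node->poll_query_until('postgres',
      "SELECT setting FROM pg_settings WHERE name = 'data_checksums';", 'off');
  $node->stop;
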
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f20be82862a..8411cecf3ff 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -568,7 +568,7 @@ main(int argc, char *argv[])
 		ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
 		pg_fatal("cluster must be shut down");
 
-	if (ControlFile->data_checksum_version == 0 &&
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_CHECK)
 		pg_fatal("data checksums are not enabled in cluster");
 
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 10de058ce91..acf5c7b026e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are in an in-progress state in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d12798be3d8..8bcc5aa8a63 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -229,7 +230,19 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(void);
+extern void SetDataChecksumsOn(void);
+extern void SetDataChecksumsOff(void);
+extern bool AbsorbChecksumsOnInProgressBarrier(void);
+extern bool AbsorbChecksumsOffInProgressBarrier(void);
+extern bool AbsorbChecksumsOnBarrier(void);
+extern bool AbsorbChecksumsOffBarrier(void);
+extern const char *show_data_checksums(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index cc06fc29ab2..cc78b00fe4c 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	uint32		new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..a8877fb87d1 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;	/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 118d6da1ace..c6f4e31a12f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12356,6 +12356,25 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'bool', proallargtypes => '{bool}',
+  proargmodes => '{i}',
+  proargnames => '{fast}',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
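
For reference, the entries above make the functions callable from SQL with the argument names given in proargnames; a hedged sketch (not part of the patch) of driving them from a TAP test using named notation, with every argument spelled out since possible DEFAULTs are defined elsewhere:

  use strict;
  use warnings FATAL => 'all';
  use PostgreSQL::Test::Cluster;

  my $node = PostgreSQL::Test::Cluster->new('proc_demo');
  $node->init(no_data_checksums => 1);
  $node->start;

  # Named notation matches the proargnames in the catalog entries above.
  $node->safe_psql('postgres',
      'SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200, fast => true);');
  $node->poll_query_until('postgres',
      "SELECT setting FROM pg_settings WHERE name = 'data_checksums';", 'on');

  $node->safe_psql('postgres', 'SELECT pg_disable_data_checksums(fast => true);');
  $node->poll_query_until('postgres',
      "SELECT setting FROM pg_settings WHERE name = 'data_checksums';", 'off');
  $node->stop;
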
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1cde4bd9bcf..cf6de4ef12d 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -162,4 +162,20 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..2a0d7b6de42 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -366,6 +366,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -391,6 +394,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..2cd066fd0fe
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for data checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Possible operations the Datachecksumsworker can perform */
+typedef enum DataChecksumsWorkerOperation
+{
+	ENABLE_DATACHECKSUMS,
+	DISABLE_DATACHECKSUMS,
+	/* TODO: VERIFY_DATACHECKSUMS, */
+}			DataChecksumsWorkerOperation;
+
+/*
+ * Possible states for a database entry which has been processed. Exported
+ * here since we want to be able to reference this from injection point tests.
+ */
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index aeb67c498c5..30fb0f62d4c 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -16,6 +16,7 @@
 
 #include "access/xlogdefs.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/item.h"
 #include "storage/off.h"
 
@@ -205,7 +206,6 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
-#define PG_DATA_CHECKSUM_VERSION	1
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..b3f368a15b5 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,20 @@
 
 #include "storage/block.h"
 
+/*
+ * Checksum version 0 means that data checksums are disabled (OFF).
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 06a1ffd4b08..b8f7ba0be51 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -85,6 +85,7 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
+PG_LWLOCK(54, DataChecksumsWorker)
 
 /*
  * There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The datachecksums launcher and worker
+ * consume another 2 slots while data checksums are being enabled or disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/Makefile b/src/test/Makefile
index 511a72e6238..278ce3e8a86 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,16 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl postmaster regress isolation modules authentication recovery subscription
+SUBDIRS = \
+		perl \
+		postmaster \
+		regress \
+		isolation \
+		modules \
+		authentication \
+		recovery \
+		subscription \
+		checksum
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 903a8ac151a..c8f2747b261 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -17,6 +17,7 @@ SUBDIRS = \
 		  test_aio \
 		  test_binaryheap \
 		  test_bloomfilter \
+		  test_checksums \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
 		  test_ddl_deparse \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 93be0f57289..6b4450eb473 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -16,6 +16,7 @@ subdir('ssl_passphrase_callback')
 subdir('test_aio')
 subdir('test_binaryheap')
 subdir('test_bloomfilter')
+subdir('test_checksums')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
 subdir('test_ddl_deparse')
diff --git a/src/test/modules/test_checksums/.gitignore b/src/test/modules/test_checksums/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/modules/test_checksums/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
new file mode 100644
index 00000000000..a5b6259a728
--- /dev/null
+++ b/src/test/modules/test_checksums/Makefile
@@ -0,0 +1,40 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/modules/test_checksums
+#
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/modules/test_checksums/Makefile
+#
+#-------------------------------------------------------------------------
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+MODULE_big = test_checksums
+OBJS = \
+	$(WIN32RES) \
+	test_checksums.o
+PGFILEDESC = "test_checksums - test code for data checksums"
+
+EXTENSION = test_checksums
+DATA = test_checksums--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/test/modules/test_checksums/README b/src/test/modules/test_checksums/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/modules/test_checksums/README
@@ -0,0 +1,22 @@
+src/test/modules/test_checksums/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: This creates a temporary installation (in the case of "check")
+with multiple nodes, primary as well as standby, for the purpose of
+the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/modules/test_checksums/meson.build b/src/test/modules/test_checksums/meson.build
new file mode 100644
index 00000000000..57156b63599
--- /dev/null
+++ b/src/test/modules/test_checksums/meson.build
@@ -0,0 +1,35 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+test_checksums_sources = files(
+  'test_checksums.c',
+)
+
+test_checksums = shared_module('test_checksums',
+  test_checksums_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += test_checksums
+
+test_install_data += files(
+	'test_checksums.control',
+	'test_checksums--1.0.sql',
+)
+
+tests += {
+  'name': 'test_checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+      'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+    },
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+      't/005_injection.pl',
+      't/006_concurrent_pgbench.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_checksums/t/001_basic.pl b/src/test/modules/test_checksums/t/001_basic.pl
new file mode 100644
index 00000000000..728a5c4510c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/001_basic.pl
@@ -0,0 +1,63 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+test_checksum_state($node, 'off');
+
+# Enable data checksums and wait for the state transition to 'on'
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1 ");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Enabling data checksums again should be a no-op, so we explicitly don't
+# wait for any state transition since none should happen here
+enable_data_checksums($node);
+test_checksum_state($node, 'on');
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again and wait for the state transition to 'off'
+disable_data_checksums($node, wait => 1);
+
+# Test reading data again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Change the underlying data to make sure that the newly computed checksums
+# will differ, then re-enable data checksums.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/002_restarts.pl b/src/test/modules/test_checksums/t/002_restarts.pl
new file mode 100644
index 00000000000..75599cf41f2
--- /dev/null
+++ b/src/test/modules/test_checksums/t/002_restarts.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with a
+# restart which breaks processing.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Initialize result storage for queries
+my $result;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 6
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Create a barrier for checksum processing to block on, in this case a
+	# pre-existing temporary table which is kept open while processing is
+	# started.  We accomplish this by setting up a background psql session
+	# which keeps the temporary table alive while we enable checksums from
+	# another psql session.
+	#
+	# This is a similar test to the synthetic variant in 005_injection.pl
+	# which fakes this scenario.
+	my $bsession = $node->background_psql('postgres');
+	$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+	# In another session, make sure we can see the blocking temp table, then
+	# start processing anyway and check that we end up blocked with the
+	# proper wait event.
+	$result = $node->safe_psql('postgres',
+		"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';"
+	);
+	is($result, 't', 'ensure we can see the temporary table');
+
+	# Enabling data checksums won't be able to complete since processing is
+	# blocked on the temporary table held open by $bsession.  Ensure that we
+	# reach inprogress-on before running further tests.
+	enable_data_checksums($node, wait => 'inprogress-on');
+
+	# Wait for the worker to reach the point where it is waiting for the
+	# leftover temporary relation to go away before it can finish.
+	$result = $node->poll_query_until(
+		'postgres',
+		"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';",
+		'ChecksumEnableTemptableWait');
+
+	# The datachecksumsworker waits for temporary tables to disappear for 3
+	# seconds before retrying, so sleep for 4 seconds to be guaranteed to see
+	# a retry cycle
+	sleep(4);
+
+	# Re-check the wait event to ensure we are blocked on the right thing.
+	$result = $node->safe_psql('postgres',
+			"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksumsworker worker';");
+	is($result, 'ChecksumEnableTemptableWait',
+		'ensure the correct wait condition is set');
+	test_checksum_state($node, 'inprogress-on');
+
+	# Stop the cluster while bsession is still attached.  We can't close the
+	# session first since the brief period between closing and stopping might
+	# be enough for checksums to get enabled.
+	$node->stop;
+	$bsession->quit;
+	$node->start;
+
+	# Ensure the checksums aren't enabled across the restart.  This leaves the
+	# cluster in the same state as before we entered the SKIP block.
+	test_checksum_state($node, 'off');
+}
+
+enable_data_checksums($node, wait => 'on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+disable_data_checksums($node, wait => 1);
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..fe34b4d7d05
--- /dev/null
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -0,0 +1,114 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+test_checksum_state($node_primary, 'off');
+test_checksum_state($node_standby_1, 'off');
+
+# ---------------------------------------------------------------------------
+# Enable checksums for the cluster, and make sure that both the primary and
+# standby change state.
+#
+
+# Ensure that the primary switches to "inprogress-on"
+enable_data_checksums($node_primary, wait => 'inprogress-on');
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+my $result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary and standby
+wait_for_checksum_state($node_primary, 'on');
+wait_for_checksum_state($node_standby_1, 'on');
+
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, '19998', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+#
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+#
+
+# Disable checksums and wait for the operation to be replayed
+disable_data_checksums($node_primary);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+# Ensure that both the primary and standby have switched to off
+wait_for_checksum_state($node_primary, 'off');
+wait_for_checksum_state($node_standby_1, 'off');
+# Double-check that data can be read back without errors
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, "19998", 'ensure we can safely read all data without checksums');
+
+$node_standby_1->stop;
+$node_primary->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/004_offline.pl b/src/test/modules/test_checksums/t/004_offline.pl
new file mode 100644
index 00000000000..e9fbcf77eab
--- /dev/null
+++ b/src/test/modules/test_checksums/t/004_offline.pl
@@ -0,0 +1,82 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Create a barrier for checksum processing to block on, in this case a
+# pre-existing temporary table which is kept open while processing is started.
+# We accomplish this by setting up a background psql session which keeps the
+# temporary table alive while we enable checksums from another psql session.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then
+# start processing anyway knowing that it will block on the temp table.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+enable_data_checksums($node, wait => 'inprogress-on');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/005_injection.pl b/src/test/modules/test_checksums/t/005_injection.pl
new file mode 100644
index 00000000000..f4459e0e636
--- /dev/null
+++ b/src/test/modules/test_checksums/t/005_injection.pl
@@ -0,0 +1,76 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# injection point tests injecting failures into the processing
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# ---------------------------------------------------------------------------
+# Test cluster setup
+#
+
+# Initiate testcluster
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Set up test environment
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+
+# ---------------------------------------------------------------------------
+# Inducing failures in processing
+
+# Force enabling checksums to fail by marking one of the databases as having
+# failed in processing.
+disable_data_checksums($node, wait => 1);
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(true);');
+enable_data_checksums($node, wait => 'off');
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(false);');
+
+# Force the checksum enabling to make multiple passes by removing one
+# database from the list in the first pass.  This simulates a CREATE
+# DATABASE issued during processing.  Doing this via fault injection avoids
+# making the test dependent on exact timing.
+$node->safe_psql('postgres', 'SELECT dcw_prune_dblist(true);');
+enable_data_checksums($node, wait => 'on');
+
+# ---------------------------------------------------------------------------
+# Timing and retry related tests
+#
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 4
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Inject a delay in the barrier for enabling checksums
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_inject_delay_barrier();');
+	enable_data_checksums($node, wait => 'on');
+
+	# Fake the existence of a temporary table at the start of processing,
+	# which will force the processing to wait for it to disappear and then
+	# retry.
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);');
+	enable_data_checksums($node, wait => 'on');
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
new file mode 100644
index 00000000000..b33ca6e0c26
--- /dev/null
+++ b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
@@ -0,0 +1,326 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# concurrent activity via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+my $node_primary_slot = 'physical_slot';
+my $node_primary_backup = 'primary_backup';
+my $node_primary;
+my $node_primary_loglocation = 0;
+my $node_standby_1;
+my $node_standby_1_loglocation = 0;
+
+# The number of full test iterations which will be performed.  The exact
+# number of tests performed and the wall time taken are non-deterministic as
+# the test performs a lot of randomized actions, but 50 iterations will be a
+# long test run regardless.
+my $TEST_ITERATIONS = 50;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+my $pgbench_running = 0;
+
+# Variables holding state for managing the cluster and aux processes in
+# various ways
+my @stop_modes = ();
+my ($pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr) =
+  ('', '', '');
+my ($pgb_standby_1_stdin, $pgb_standby_1_stdout, $pgb_standby_1_stderr) =
+  ('', '', '');
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# Helper returning a random boolean with even distribution, used to decide
+# whether to perform optional actions during testing.
+sub cointoss
+{
+	return int(rand(2));
+}
+
+# Helper for injecting random sleeps here and there in the test run.  The
+# sleep duration won't be predictable, in order to avoid sleep patterns that
+# happen to mask race conditions and timing bugs.
+sub random_sleep
+{
+	return if cointoss;
+	sleep(int(rand(3)));
+}
+
+# Start a read-only pgbench run in the background against the server specified
+# via the port passed as parameter
+sub background_ro_pgbench
+{
+	my ($port, $stdin, $stdout, $stderr) = @_;
+
+	my $pgbench = IPC::Run::start(
+		[ 'pgbench', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres' ],
+		'<' => \$stdin,
+		'>' => \$stdout,
+		'2>' => \$stderr,
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter
+sub background_rw_pgbench
+{
+	my ($port, $stdin, $stdout, $stderr) = @_;
+
+	my $pgbench = IPC::Run::start(
+		[ 'pgbench', '-p', $port, '-T', '600', '-c', '10', 'postgres' ],
+		'<' => \$stdin,
+		'>' => \$stdout,
+		'2>' => \$stderr,
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Invert the state of data checksums in the cluster, if data checksums are on
+# then disable them and vice versa. Also performs proper validation of the
+# before and after state.
+sub flip_data_checksums
+{
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(true);')
+		  if cointoss();
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums($node_primary, wait => 'inprogress-on');
+		random_sleep();
+		# Wait for checksum enable to be replayed
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Ensure that the standby has switched to "inprogress-on" or "on".
+		# Normally it would be "inprogress-on", but it is theoretically
+		# possible for the primary to complete the checksum enabling *and* have
+		# the standby replay that record before we reach the check below.
+		my $result = $node_standby_1->poll_query_until(
+			'postgres',
+			"SELECT setting = 'off' "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';",
+			'f');
+		is($result, 1,
+			'ensure standby has absorbed the inprogress-on barrier');
+		random_sleep();
+		$result = $node_standby_1->safe_psql('postgres',
+				"SELECT setting "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';");
+
+		is(($result eq 'inprogress-on' || $result eq 'on'),
+			1, 'ensure checksums are on, or in progress, on standby_1');
+
+		# Wait for checksums enabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'on');
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'on');
+
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+		disable_data_checksums($node_primary);
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Wait for checksums disabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'off');
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since that might pass subtly by let's ensure it gets
+		# caught with a test error if so.
+		is(1, 0, 'data_checksum_state variable has invalid state');
+	}
+}
+
+# Prepare an array with pg_ctl stop modes which we later can randomly select
+# from in order to stop the cluster in some way.
+for (my $i = 1; $i <= 100; $i++)
+{
+	if (int(rand($i * 2)) > $i)
+	{
+		push(@stop_modes, "immediate");
+	}
+	else
+	{
+		push(@stop_modes, "fast");
+	}
+}
+
+# Create and start a cluster with one primary and one standby node, and ensure
+# they are caught up and in sync.
+$node_primary = PostgreSQL::Test::Cluster->new('main');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be bumped in order to accommodate the pgbench
+# clients, and log_statement is dialed down since it otherwise will generate
+# enormous amounts of logging.  Page verification failures are still logged.
+$node_primary->append_conf(
+	'postgresql.conf',
+	qq[
+max_connections = 30
+log_statement = none
+]);
+$node_primary->start;
+$node_primary->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$node_primary_slot');");
+$node_primary->backup($node_primary_backup);
+
+$node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $node_primary_backup,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$node_primary_slot'
+]);
+$node_standby_1->start;
+
+$node_primary->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Start the test suite with pgbench running.
+background_ro_pgbench(
+	$node_standby_1->port, $pgb_standby_1_stdin,
+	$pgb_standby_1_stdout, $pgb_standby_1_stderr);
+background_rw_pgbench(
+	$node_primary->port, $pgb_primary_stdin,
+	$pgb_primary_stdout, $pgb_primary_stderr);
+
+# Main test suite.  This loop keeps pgbench running against the cluster and,
+# while that's running, flips the state of data checksums concurrently.  It
+# then randomly restarts the cluster (in fast or immediate mode) and checks
+# for the desired state.  The idea behind doing things randomly is to shake
+# out any timing related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced to
+# its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	if (!$node_primary->is_alive)
+	{
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+			$node_primary_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log");
+		$node_primary_loglocation = -s $node_primary->logfile;
+
+		# If data checksums are enabled, take the opportunity to verify them
+		# while the cluster is offline
+		$node_primary->checksum_verify_offline()
+		  unless $data_checksum_state eq 'off';
+		random_sleep();
+		$node_primary->start;
+		# Start a pgbench in the background against the primary
+		background_rw_pgbench($node_primary->port, $pgb_primary_stdin,
+			$pgb_primary_stdout, $pgb_primary_stderr);
+	}
+
+	if (!$node_standby_1->is_alive)
+	{
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log =
+		  PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+			$node_standby_1_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in standby_1 log");
+		$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+		# If data checksums are enabled, take the opportunity to verify them
+		# while the cluster is offline
+		$node_standby_1->checksum_verify_offline()
+		  unless $data_checksum_state eq 'off';
+		random_sleep();
+		$node_standby_1->start;
+		# Start a select-only pgbench in the background on the standby
+		background_ro_pgbench($node_standby_1->port, $pgb_standby_1_stdin,
+			$pgb_standby_1_stdout, $pgb_standby_1_stderr);
+	}
+
+	$node_primary->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result = $node_primary->safe_psql('postgres',
+		"SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+	random_sleep();
+	$node_primary->wait_for_catchup($node_standby_1, 'write');
+
+	# Potentially powercycle the cluster
+	$node_primary->stop($stop_modes[ int(rand(100)) ]) if cointoss();
+	random_sleep();
+	$node_standby_1->stop($stop_modes[ int(rand(100)) ]) if cointoss();
+}
+
+# Testrun is over, ensure that data reads back as expected and perform a final
+# verification of the data checksum state.
+my $result =
+  $node_primary->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node_primary, $data_checksum_state);
+test_checksum_state($node_standby_1, $data_checksum_state);
+
+# Perform one final pass over the logs and hunt for unexpected errors
+my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+	$node_primary_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in primary log");
+$node_primary_loglocation = -s $node_primary->logfile;
+$log = PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+	$node_standby_1_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in standby_1 log");
+$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+$node_standby_1->teardown_node;
+$node_primary->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/DataChecksums/Utils.pm b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
new file mode 100644
index 00000000000..ee2f2a1428f
--- /dev/null
+++ b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
@@ -0,0 +1,185 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+DataChecksums::Utils - Utility functions for testing data checksums in a running cluster
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::Cluster;
+  use DataChecksums::Utils qw( .. );
+
+  # Create, and start, a new cluster
+  my $node = PostgreSQL::Test::Cluster->new('primary');
+  $node->init;
+  $node->start;
+
+  test_checksum_state($node, 'off');
+
+  enable_data_checksums($node);
+
+  wait_for_checksum_state($node, 'on');
+
+
+=cut
+
+package DataChecksums::Utils;
+
+use strict;
+use warnings FATAL => 'all';
+use Exporter 'import';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+our @EXPORT = qw(
+  test_checksum_state
+  wait_for_checksum_state
+  enable_data_checksums
+  disable_data_checksums
+);
+
+=pod
+
+=head1 METHODS
+
+=over
+
+=item test_checksum_state(node, state)
+
+Test that the current value of the data checksum GUC in the server running
+at B<node> matches B<state>.  If the values differ, a test failure is logged.
+Returns true if the values match, otherwise false.
+
+=cut
+
+sub test_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $result = $postgresnode->safe_psql('postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+	);
+	is($result, $state, 'ensure checksums are set to ' . $state);
+	return $result eq $state;
+}
+
+=item wait_for_checksum_state(node, state)
+
+Poll the value of the data checksum GUC in the server running at B<node>
+repeatedly until it matches B<state> or the poll times out.  The poll runs
+for up to $PostgreSQL::Test::Utils::timeout_default seconds.  Returns true
+if the state was reached; otherwise a test failure is logged and false is
+returned.
+
+=cut
+
+sub wait_for_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $res = $postgresnode->poll_query_until(
+		'postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+		$state);
+	is($res, 1, 'ensure data checksums are transitioned to ' . $state);
+	return $res == 1;
+}
+
+=item enable_data_checksums($node, %params)
+
+Function for enabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item cost_delay
+
+The C<cost_delay> to use when enabling data checksums, default is 0.
+
+=item cost_limit
+
+The C<cost_limit> to use when enabling data checksums, default is 100.
+
+=item fast
+
+If set to C<true> an immediate checkpoint will be issued after data
+checksums are enabled.  Setting this to false will lead to slower tests.
+The default is true.
+
+=item wait
+
+If defined, the function will wait for the state given in this parameter,
+or for the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+
+=back
+
+=cut
+
+sub enable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{cost_delay} = 0 unless (defined($params{cost_delay}));
+	$params{cost_limit} = 100 unless (defined($params{cost_limit}));
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_enable_data_checksums(%s, %s, %s);
+EOQ
+
+	$postgresnode->safe_psql(
+		'postgres',
+		sprintf($query,
+			$params{cost_delay}, $params{cost_limit}, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, $params{wait})
+	  if (defined($params{wait}));
+}
+
+=item disable_data_checksums($node, %params)
+
+Function for disabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item wait
+
+If defined, the function will wait for the state to turn to B<off>, or for
+the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+Unlike in C<enable_data_checksums>, the value of the parameter is ignored.
+
+=back
+
+=cut
+
+sub disable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_disable_data_checksums(%s);
+EOQ
+
+	$postgresnode->safe_psql('postgres', sprintf($query, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, 'off') if (defined($params{wait}));
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
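
As a usage note for the module above, here is a compact sketch of a test which passes the optional parameters explicitly instead of relying on the defaults (illustrative only; the values are arbitrary):

  use strict;
  use warnings FATAL => 'all';
  use PostgreSQL::Test::Cluster;
  use PostgreSQL::Test::Utils;
  use Test::More;

  use FindBin;
  use lib $FindBin::RealBin;
  use DataChecksums::Utils;

  my $node = PostgreSQL::Test::Cluster->new('utils_demo');
  $node->init(no_data_checksums => 1);
  $node->start;

  # Throttle the worker slightly and block until the cluster reports 'on'.
  enable_data_checksums($node,
      cost_delay => 10,
      cost_limit => 200,
      wait => 'on');
  test_checksum_state($node, 'on');

  # For disabling, the wait parameter only controls whether we wait for 'off';
  # its value is ignored.
  disable_data_checksums($node, wait => 1);

  $node->stop;
  done_testing();
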
diff --git a/src/test/modules/test_checksums/test_checksums--1.0.sql b/src/test/modules/test_checksums/test_checksums--1.0.sql
new file mode 100644
index 00000000000..704b45a3186
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums--1.0.sql
@@ -0,0 +1,20 @@
+/* src/test/modules/test_checksums/test_checksums--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_checksums" to load this file. \quit
+
+CREATE FUNCTION dcw_inject_delay_barrier(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_inject_fail_database(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_prune_dblist(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_fake_temptable(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_checksums/test_checksums.c b/src/test/modules/test_checksums/test_checksums.c
new file mode 100644
index 00000000000..26897bff960
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.c
@@ -0,0 +1,173 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_checksums.c
+ *		Test data checksums
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_checksums/test_checksums.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/latch.h"
+#include "utils/injection_point.h"
+#include "utils/wait_event.h"
+
+#define USEC_PER_SEC    1000000
+
+
+PG_MODULE_MAGIC;
+
+extern PGDLLEXPORT void dc_delay_barrier(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fail_database(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_dblist(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fake_temptable(const char *name, const void *private_data, void *arg);
+
+/*
+ * Test for delaying emission of procsignalbarriers.
+ */
+void
+dc_delay_barrier(const char *name, const void *private_data, void *arg)
+{
+	(void) name;
+	(void) private_data;
+
+	(void) WaitLatch(MyLatch,
+					 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+					 (3 * 1000),
+					 WAIT_EVENT_PG_SLEEP);
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_delay_barrier);
+Datum
+dcw_inject_delay_barrier(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-enable-checksums-delay",
+							 "test_checksums",
+							 "dc_delay_barrier",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksums-enable-checksums-delay");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+void
+dc_fail_database(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	DataChecksumsWorkerResult *res = (DataChecksumsWorkerResult *) arg;
+
+	if (first_pass)
+		*res = DATACHECKSUMSWORKER_FAILED;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_fail_database);
+Datum
+dcw_inject_fail_database(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fail-db",
+							 "test_checksums",
+							 "dc_fail_database",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fail-db");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to remove an entry from the Databaselist to force re-processing since
+ * not all databases could be processed in the first iteration of the loop.
+ */
+void
+dc_dblist(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	List	   *DatabaseList = (List *) arg;
+
+	if (first_pass)
+		DatabaseList = list_delete_last(DatabaseList);
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_prune_dblist);
+Datum
+dcw_prune_dblist(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-initial-dblist",
+							 "test_checksums",
+							 "dc_dblist",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-initial-dblist");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to force waiting for existing temptables.
+ */
+void
+dc_fake_temptable(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	int		   *numleft = (int *) arg;
+
+	if (first_pass)
+		*numleft = 1;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_fake_temptable);
+Datum
+dcw_fake_temptable(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fake-temptable-wait",
+							 "test_checksums",
+							 "dc_fake_temptable",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fake-temptable-wait");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_checksums/test_checksums.control b/src/test/modules/test_checksums/test_checksums.control
new file mode 100644
index 00000000000..84b4cc035a7
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.control
@@ -0,0 +1,4 @@
+comment = 'Test code for data checksums'
+default_version = '1.0'
+module_pathname = '$libdir/test_checksums'
+relocatable = true
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 35413f14019..3af7944acea 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3872,6 +3872,51 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "# Enabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item checksum_disable_offline
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "# Disabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
+sub checksum_verify_offline
+{
+	my ($self) = @_;
+
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-c');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 35e8aad7701..4b9c5526e50 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2071,6 +2071,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 605f5070376..9042e4d38e3 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -59,6 +59,22 @@ io worker|relation|vacuum
 io worker|temp relation|normal
 io worker|wal|init
 io worker|wal|normal
+datachecksumsworker launcher|relation|bulkread
+datachecksumsworker launcher|relation|bulkwrite
+datachecksumsworker launcher|relation|init
+datachecksumsworker launcher|relation|normal
+datachecksumsworker launcher|relation|vacuum
+datachecksumsworker launcher|temp relation|normal
+datachecksumsworker launcher|wal|init
+datachecksumsworker launcher|wal|normal
+datachecksumsworker worker|relation|bulkread
+datachecksumsworker worker|relation|bulkwrite
+datachecksumsworker worker|relation|init
+datachecksumsworker worker|relation|normal
+datachecksumsworker worker|relation|vacuum
+datachecksumsworker worker|temp relation|normal
+datachecksumsworker worker|wal|init
+datachecksumsworker worker|wal|normal
 slotsync worker|relation|bulkread
 slotsync worker|relation|bulkwrite
 slotsync worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(87 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..df0f49ea2aa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -416,6 +416,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -608,6 +609,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4243,6 +4248,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#56Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#55)
Re: Changing the state of data checksums in a running cluster

On 8/27/25 13:00, Daniel Gustafsson wrote:

On 27 Aug 2025, at 11:39, Tomas Vondra <tomas@vondra.me> wrote:

Just to be clear - I don't see any pg_checksums failures either. I only
see failures in the standby log, and I don't think the script checks
that (it probably should).

Right, that's what I've been checking too. I have been considering adding
another background process for monitoring all the log entries but I just
thought of a much simpler solution. When the clusters are turned off we can
take the opportunity to slurp the log written since last restart and inspect
it. The attached adds this.

There's still a couple issues, unfortunately. First, this may not do
what you intended:

sub cointoss
{
return int(rand(2) == 1);
}

The rand() call returns a float in [0, 2.0), so an exact comparison with 1
essentially never succeeds, and there are no restarts. I fixed this to

sub cointoss
{
return int(rand() < 0.5);
}

and then it starts working, restarting with ~50% probability.
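For the record, a throwaway snippet (not part of the test) illustrating the
difference:

# count exact hits over a million draws; expect 0, since rand(2) is a
# continuous value in [0, 2)
my $hits = grep { rand(2) == 1 } 1 .. 1_000_000;
print "exact hits: $hits\n";

# either of these behaves as a fair coin instead
my $coin1 = int(rand(2));          # 0 or 1
my $coin2 = rand() < 0.5 ? 1 : 0;  # same distribution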

Then it hits another problem when calling pg_checksums. That only works
after a "clean" shutdown, not after an immediate one. This can't just check
is_alive, it needs to remember exactly how the server was stopped and
only verify checksums for 'fast' mode. I never saw any failures in
pg_checksums, so I just commented that out.

Then it starts working as intended, I think. I only did a couple of runs so
far, but I haven't found any checksum failures ... Then I realized I did
the earlier runs with a slightly stale master HEAD (38c5fbd97ee6a to be
precise), while this time I did a git pull.

And this happened on Friday:

commit c13070a27b63d9ce4850d88a63bf889a6fde26f0
Author: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri Aug 22 18:44:39 2025 +0300

Revert "Get rid of WALBufMappingLock"

This reverts commit bc22dc0e0ddc2dcb6043a732415019cc6b6bf683.
It appears that conditional variables are not suitable for use
inside critical sections. If WaitLatch()/WaitEventSetWaitBlock()
face postmaster death, they exit, releasing all locks instead of
PANIC. In certain situations, this leads to data corruption.

...

I think it's very likely the checksums were broken by this. After all,
that linked thread has subject "VM corruption on standby" and I've only
ever seen checksum failures on standby on the _vm fork.

There's also an issue when calling teardown_node at the end. That won't
work if the instance got stopped in the last round, and the test will
fail like this:

t/006_concurrent_pgbench.pl .. 379/? # Tests were run but no plan was
declared and done_testing() was not seen.
# Looks like your test exited with 29 just after 379.
t/006_concurrent_pgbench.pl .. Dubious, test returned 29 (wstat 7424,
0x1d00)
All 379 subtests passed

It would probably be good to at some point clean this up a little by placing
all of the variables for a single node in an associative hash which can be passed
around, and placing repeated code in subroutines etc.

Yeah. For me Perl is hard to read in any case, but if we can clean
this up a little bit before adding more cases, that'd be nice.

regards

--
Tomas Vondra

#57Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#56)
Re: Changing the state of data checksums in a running cluster

On 8/27/25 14:39, Tomas Vondra wrote:

...

And this happened on Friday:

commit c13070a27b63d9ce4850d88a63bf889a6fde26f0
Author: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri Aug 22 18:44:39 2025 +0300

Revert "Get rid of WALBufMappingLock"

This reverts commit bc22dc0e0ddc2dcb6043a732415019cc6b6bf683.
It appears that conditional variables are not suitable for use
inside critical sections. If WaitLatch()/WaitEventSetWaitBlock()
face postmaster death, they exit, releasing all locks instead of
PANIC. In certain situations, this leads to data corruption.

...

I think it's very likely the checksums were broken by this. After all,
that linked thread has subject "VM corruption on standby" and I've only
ever seen checksum failures on standby on the _vm fork.

Forgot to mention - I did try with c13070a27b reverted, and with that I
can reproduce the checksum failures again (using the fixed TAP test).

It's not definitive proof, but it's a hint that the issue fixed by
c13070a27b63 was causing the checksum failures.

regards

--
Tomas Vondra

#58Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#56)
2 attachment(s)
Re: Changing the state of data checksums in a running cluster

Hi,

I spent a bit more time fixing the TAP test. The attached patch makes it
"work" for me (or I think it should, in principle). I'm not saying it's
the best way to do stuff.

With the patch applied, I tried running it, and I got a failure when
running pg_checksums. There's a log snippet describing the issue, but
AFAICS it's happening like this:

1) checksums are disabled
2) flip_data_checksums gets called
3) both clusters go through 'inprogress-on' and 'on' states
4) primary gets shutdown in 'immediate' mode
5) standby gets shutdown in 'fast' mode
6) we try to validate checksums on the standby, but control file still
says checksums=inprogress-on

This seems like a bug to me - AFAICS the expectation is that after a fast
shutdown, we don't forget the checksum state. Or is that expected? In
that case the TAP test probably needs to check the control file instead
of relying on the Perl variable $data_checksum_state. Or maybe it should
check that the control file has the correct / expected state?
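Just as a sketch of that last idea (assuming pg_controldata's "Data page
checksum version" line is the right thing to look at with this patch, and
with an $expected_version value tracked by the test):

use PostgreSQL::Test::Utils;

# read the checksum state straight from the control file rather than
# trusting the variable tracked by the test script
my ($stdout, $stderr) =
  run_command([ 'pg_controldata', $node_standby_1->data_dir ]);
my ($control_state) = $stdout =~ /^Data page checksum version:\s+(.+)$/m;
is($control_state, $expected_version,
	'control file matches the expected checksum state');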

FWIW I don't think the primary shutdown matters. I've seen several of
these failures, and it happens even without a primary shutdown. But the
standby "fast" shutdown is always there.

But this also shows a limitation of the TAP test - it never triggers the
shutdowns while flipping the checksums (in flip_data_checksums). I think
that's something worth testing.

regards

--
Tomas Vondra

Attachments:

checksum-tap-fix.patch (text/x-patch)
diff --git a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
index b33ca6e0c26..5cee6d4a6b5 100644
--- a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
+++ b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
@@ -55,7 +55,7 @@ if ($ENV{enable_injection_points} ne 'yes')
 # whether to turn things off during testing.
 sub cointoss
 {
-	return int(rand(2) == 1);
+	return int(rand() < 0.5);
 }
 
 # Helper for injecting random sleeps here and there in the testrun. The sleep
@@ -74,7 +74,7 @@ sub background_ro_pgbench
 	my ($port, $stdin, $stdout, $stderr) = @_;
 
 	my $pgbench_primary = IPC::Run::start(
-		[ 'pgbench', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres' ],
+		[ 'pgbench', '-n', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres' ],
 		'<' => \$stdin,
 		'>' => \$stdout,
 		'2>' => \$stderr,
@@ -224,6 +224,9 @@ background_rw_pgbench(
 	$node_primary->port, $pgb_primary_stdin,
 	$pgb_primary_stdout, $pgb_primary_stderr);
 
+my $primary_shutdown_clean = 0;
+my $standby_shutdown_clean = 0;
+
 # Main test suite. This loop will start a pgbench run on the cluster and while
 # that's running flip the state of data checksums concurrently. It will then
 # randomly restart thec cluster (in fast or immediate) mode and then check for
@@ -246,9 +249,11 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 		$node_primary_loglocation = -s $node_primary->logfile;
 
 		# If data checksums are enabled, take the opportunity to verify them
-		# while the cluster is offline
+		# while the cluster is offline (but only if stopped in a clean way,
+		# not after immediate shutdown)
 		$node_primary->checksum_verify_offline()
-		  unless $data_checksum_state eq 'off';
+		  unless $data_checksum_state eq 'off' or !$primary_shutdown_clean;
+
 		random_sleep();
 		$node_primary->start;
 		# Start a pgbench in the background against the primary
@@ -270,9 +275,11 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 		$node_standby_1_loglocation = -s $node_standby_1->logfile;
 
 		# If data checksums are enabled, take the opportunity to verify them
-		# while the cluster is offline
+		# while the cluster is offline (but only if stopped in a clean way,
+		# not after immediate shutdown)
 		$node_standby_1->checksum_verify_offline()
-		  unless $data_checksum_state eq 'off';
+		  unless $data_checksum_state eq 'off' or !$standby_shutdown_clean;
+
 		random_sleep();
 		$node_standby_1->start;
 		# Start a select-only pgbench in the background on the standby
@@ -287,13 +294,41 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 	my $result = $node_primary->safe_psql('postgres',
 		"SELECT count(*) FROM t WHERE a > 1");
 	is($result, '100000', 'ensure data pages can be read back on primary');
+
 	random_sleep();
+
 	$node_primary->wait_for_catchup($node_standby_1, 'write');
 
-	# Potentially powercycle the cluster
-	$node_primary->stop($stop_modes[ int(rand(100)) ]) if cointoss();
 	random_sleep();
-	$node_standby_1->stop($stop_modes[ int(rand(100)) ]) if cointoss();
+
+	# Potentially powercycle the cluster (the nodes independently)
+	# XXX should maybe try stopping nodes in the opposite order too?
+	if (cointoss())
+	{
+		my $mode = $stop_modes[ int(rand(100)) ];
+		$node_primary->stop($mode);
+		$primary_shutdown_clean = ($mode eq 'fast');
+	}
+
+	random_sleep();
+
+	if (cointoss())
+	{
+		my $mode = $stop_modes[ int(rand(100)) ];
+		$node_standby_1->stop($mode);
+		$standby_shutdown_clean = ($mode eq 'fast');
+	}
+}
+
+# make sure the nodes are running
+if (!$node_primary->is_alive)
+{
+	$node_primary->start;
+}
+
+if (!$node_standby_1->is_alive)
+{
+        $node_standby_1->start;
 }
 
 # Testrun is over, ensure that data reads back as expected and perform a final
checksum-failure.txt (text/plain)
#59Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#57)
2 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 8/27/25 14:42, Tomas Vondra wrote:

On 8/27/25 14:39, Tomas Vondra wrote:

...

And this happened on Friday:

commit c13070a27b63d9ce4850d88a63bf889a6fde26f0
Author: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri Aug 22 18:44:39 2025 +0300

Revert "Get rid of WALBufMappingLock"

This reverts commit bc22dc0e0ddc2dcb6043a732415019cc6b6bf683.
It appears that conditional variables are not suitable for use
inside critical sections. If WaitLatch()/WaitEventSetWaitBlock()
face postmaster death, they exit, releasing all locks instead of
PANIC. In certain situations, this leads to data corruption.

...

I think it's very likely the checksums were broken by this. After all,
that linked thread has subject "VM corruption on standby" and I've only
ever seen checksum failures on standby on the _vm fork.

Forgot to mention - I did try with c13070a27b reverted, and with that I
can reproduce the checksum failures again (using the fixed TAP test).

It's not definitive proof, but it's a hint that the issue fixed by
c13070a27b63 was causing the checksum failures.

Unfortunately, it seems I spoke too soon :-( I decided to test this on
multiple machines overnight, and it still fails on the slower ones.

Attached is a patch addressing a couple more issues, to make the TAP
test work well enough. (Attached as .txt, to not confuse cfbot).

- The pgbench started by IPC::Run::start() needs to be finished, to
release resources. Otherwise it leaks file descriptors (and leaves a
bunch of "defunct" pgbench processes around), which may be a problem with
an increased number of iterations.

- AFAICS the pgbench can't use stdin/stdout/stderr, otherwise the pipes
get broken when the command fails (after restart). I just used /dev/null
instead, the stdout/stderr was not used anyway (or was it?).

- commented out the pg_checksums call, because of the issues mentioned
earlier (I was trying to make it work by remembering the state, but the
state occasionally doesn't seem to make it into the control file before shutdown)

I increased the number of iterations to 1000+ and ran it on three machines:

- ryzen (new machine from ~2024)
- xeon (old slow machine from ~2016)
- rpi5 (very slow machine)

I haven't seen a single failure on ryzen, after ~3000 iterations. But
both xeon and rpi5 show a number of failures. Xeon has about 35 reports
of 'Failed test', rpi5 about 10.

My guess is it's something about timing. It works on the "fast" ryzen,
but fails on the xeon, which is ~3-4x slower, and on the rpi5, which is
even slower.

The other reason why it seems unrelated to the reverted commit is that
it's not just about visibility maps (which is what got corrupted there). I see
checksum failures on both VM and FSM forks. I think I was misled by forgetting
about the FSM cases, and by the fact that I saw no failures on the ryzen
after the revert. But clearly, other machines still have issues.

Another interesting fact is that the checksum failures happen both on
the primary and the standby; it's not just a standby issue. But again,
this seems to be machine-dependent. On the rpi5 I've only seen standby
issues. The xeon sees failures on both primary and standby (roughly 1:1).

There are more weird things. If I grep for page verification failures, I see
this (a more detailed log is attached):

-----------
# 2025-08-28 22:33:28.195 CEST startup[177466] LOG: page verification
failed, calculated checksum 25350 but expected 44559
# 2025-08-28 22:33:28.197 CEST startup[177466] LOG: page verification
failed, calculated checksum 25350 but expected 44559
# 2025-08-28 22:33:28.199 CEST startup[177466] LOG: page verification
failed, calculated checksum 59909 but expected 53920
# 2025-08-28 22:33:28.201 CEST startup[177466] LOG: page verification
failed, calculated checksum 59909 but expected 53920
# 2025-08-28 22:33:28.206 CEST startup[177466] LOG: page verification
failed, calculated checksum 59909 but expected 53920
# 2025-08-28 22:33:28.207 CEST startup[177466] LOG: page verification
failed, calculated checksum 25350 but expected 44559
-----------

This is right after a single restart, while doing the recovery. The
weird thing is, this is all for just two FSM pages!

-----------
LOG: invalid page in block 2 of relation "base/5/16410_fsm"; zeroing
out page
LOG: invalid page in block 2 of relation "base/5/16408_fsm"; zeroing
out page
-----------

And the calculated/expected checksums repeat! It's just different WAL
records hitting the same page, and complaining about the same issue,
after claiming the page was zeroed out. Isn't that weird? How come the
page doesn't "get" the correct checksum after the first redo?

I've seen these failures after changing checksums in both directions,
both after enabling and disabling. But I've only ever seen this after an
immediate shutdown, never after a fast shutdown. (It's interesting that
pg_checksums failed only after fast shutdowns ...).

Could it be that the redo happens to start from an older position, but
using the new checksum version?

regards

--
Tomas Vondra

Attachments:

failure.log (text/x-log)
tap-fixes.txt (text/plain)
From 57bb79b1bc8faac646131336abcc1596711c5f32 Mon Sep 17 00:00:00 2001
From: tomas <tomas>
Date: Thu, 28 Aug 2025 22:10:25 +0200
Subject: [PATCH] TAP fixes

---
 .../t/006_concurrent_pgbench.pl               | 88 ++++++++++++++-----
 1 file changed, 68 insertions(+), 20 deletions(-)

diff --git a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
index b33ca6e0c26..374eac7e6a3 100644
--- a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
+++ b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
@@ -23,11 +23,14 @@ my $node_primary_loglocation = 0;
 my $node_standby_1;
 my $node_standby_1_loglocation = 0;
 
+my $pgbench_primary = undef;
+my $pgbench_standby = undef;
+
 # The number of full test iterations which will be performed. The exact number
 # of tests performed and the wall time taken is non-deterministic as the test
 # performs a lot of randomized actions, but 50 iterations will be a long test
 # run regardless.
-my $TEST_ITERATIONS = 50;
+my $TEST_ITERATIONS = 1000;
 
 # Variables which record the current state of the cluster
 my $data_checksum_state = 'off';
@@ -55,7 +58,7 @@ if ($ENV{enable_injection_points} ne 'yes')
 # whether to turn things off during testing.
 sub cointoss
 {
-	return int(rand(2) == 1);
+	return int(rand() < 0.5);
 }
 
 # Helper for injecting random sleeps here and there in the testrun. The sleep
@@ -73,11 +76,16 @@ sub background_ro_pgbench
 {
 	my ($port, $stdin, $stdout, $stderr) = @_;
 
-	my $pgbench_primary = IPC::Run::start(
-		[ 'pgbench', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres' ],
-		'<' => \$stdin,
-		'>' => \$stdout,
-		'2>' => \$stderr,
+	if ($pgbench_standby)
+	{
+		$pgbench_standby->finish;
+	}
+
+	$pgbench_standby = IPC::Run::start(
+		[ 'pgbench', '-n', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres' ],
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
 		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
 }
 
@@ -87,11 +95,16 @@ sub background_rw_pgbench
 {
 	my ($port, $stdin, $stdout, $stderr) = @_;
 
-	my $pgbench_primary = IPC::Run::start(
+	if ($pgbench_primary)
+	{
+		$pgbench_primary->finish;
+	}
+
+	$pgbench_primary = IPC::Run::start(
 		[ 'pgbench', '-p', $port, '-T', '600', '-c', '10', 'postgres' ],
-		'<' => \$stdin,
-		'>' => \$stdout,
-		'2>' => \$stderr,
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
 		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
 }
 
@@ -224,6 +237,9 @@ background_rw_pgbench(
 	$node_primary->port, $pgb_primary_stdin,
 	$pgb_primary_stdout, $pgb_primary_stderr);
 
+my $primary_shutdown_clean = 0;
+my $standby_shutdown_clean = 0;
+
 # Main test suite. This loop will start a pgbench run on the cluster and while
 # that's running flip the state of data checksums concurrently. It will then
 # randomly restart thec cluster (in fast or immediate) mode and then check for
@@ -246,9 +262,11 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 		$node_primary_loglocation = -s $node_primary->logfile;
 
 		# If data checksums are enabled, take the opportunity to verify them
-		# while the cluster is offline
-		$node_primary->checksum_verify_offline()
-		  unless $data_checksum_state eq 'off';
+		# while the cluster is offline (but only if stopped in a clean way,
+		# not after immediate shutdown)
+		#$node_primary->checksum_verify_offline()
+		#  unless $data_checksum_state eq 'off' or !$primary_shutdown_clean;
+
 		random_sleep();
 		$node_primary->start;
 		# Start a pgbench in the background against the primary
@@ -270,9 +288,11 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 		$node_standby_1_loglocation = -s $node_standby_1->logfile;
 
 		# If data checksums are enabled, take the opportunity to verify them
-		# while the cluster is offline
-		$node_standby_1->checksum_verify_offline()
-		  unless $data_checksum_state eq 'off';
+		# while the cluster is offline (but only if stopped in a clean way,
+		# not after immediate shutdown)
+		#$node_standby_1->checksum_verify_offline()
+		#  unless $data_checksum_state eq 'off' or !$standby_shutdown_clean;
+
 		random_sleep();
 		$node_standby_1->start;
 		# Start a select-only pgbench in the background on the standby
@@ -287,13 +307,41 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 	my $result = $node_primary->safe_psql('postgres',
 		"SELECT count(*) FROM t WHERE a > 1");
 	is($result, '100000', 'ensure data pages can be read back on primary');
+
 	random_sleep();
+
 	$node_primary->wait_for_catchup($node_standby_1, 'write');
 
-	# Potentially powercycle the cluster
-	$node_primary->stop($stop_modes[ int(rand(100)) ]) if cointoss();
 	random_sleep();
-	$node_standby_1->stop($stop_modes[ int(rand(100)) ]) if cointoss();
+
+	# Potentially powercycle the cluster (the nodes independently)
+	# XXX should maybe try stopping nodes in the opposite order too?
+	if (cointoss())
+	{
+		my $mode = $stop_modes[ int(rand(100)) ];
+		$node_primary->stop($mode);
+		$primary_shutdown_clean = ($mode eq 'fast');
+	}
+
+	random_sleep();
+
+	if (cointoss())
+	{
+		my $mode = $stop_modes[ int(rand(100)) ];
+		$node_standby_1->stop($mode);
+		$standby_shutdown_clean = ($mode eq 'fast');
+	}
+}
+
+# make sure the nodes are running
+if (!$node_primary->is_alive)
+{
+	$node_primary->start;
+}
+
+if (!$node_standby_1->is_alive)
+{
+        $node_standby_1->start;
 }
 
 # Testrun is over, ensure that data reads back as expected and perform a final
-- 
2.39.5

#60Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#59)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 8/29/25 16:26, Tomas Vondra wrote:

...

I've seen these failures after changing checksums in both directions,
both after enabling and disabling. But I've only ever seen this after an
immediate shutdown, never after a fast shutdown. (It's interesting that
pg_checksums failed only after fast shutdowns ...).

Of course, right after I sent that message, it failed after a fast shutdown,
contradicting my observation ...

Could it be that the redo happens to start from an older position, but
using the new checksum version?

... but it also provided more data supporting this hypothesis. I added
logging of pg_current_wal_lsn() before / after changing checksums on the
primary, and I see this:

1) LSN before: 14/2B0F26A8
2) enable checksums
3) LSN after: 14/EE335D60
4) standby waits for 14/F4E786E8 (higher, likely thanks to pgbench)
5) standby restarts with -m fast
6) redo starts at 14/230043B0, which is *before* enabling checksums

I guess this is the root cause. A bit more detailed log attached.
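(The extra logging is nothing fancy, essentially just capturing the LSN
around the flip on the primary, roughly like this:)

# capture WAL positions around the checksum flip (sketch of the added logging)
my $lsn_before = $node_primary->safe_psql('postgres',
	'SELECT pg_current_wal_lsn()');
$node_primary->safe_psql('postgres', 'SELECT pg_enable_data_checksums()');
# ... wait for data_checksums to report 'on' ...
my $lsn_after = $node_primary->safe_psql('postgres',
	'SELECT pg_current_wal_lsn()');
note "LSN before: $lsn_before, LSN after: $lsn_after";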

regards

--
Tomas Vondra

Attachments:

failure2.log (text/x-log)
#61Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#60)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 8/29/25 16:38, Tomas Vondra wrote:

On 8/29/25 16:26, Tomas Vondra wrote:

...

I've seen these failures after changing checksums in both directions,
both after enabling and disabling. But I've only ever seen this after an
immediate shutdown, never after a fast shutdown. (It's interesting that
pg_checksums failed only after fast shutdowns ...).

Of course, right after I sent that message, it failed after a fast shutdown,
contradicting my observation ...

Could it be that the redo happens to start from an older position, but
using the new checksum version?

... but it also provided more data supporting this hypothesis. I added
logging of pg_current_wal_lsn() before / after changing checksums on the
primary, and I see this:

1) LSN before: 14/2B0F26A8
2) enable checksums
3) LSN after: 14/EE335D60
4) standby waits for 14/F4E786E8 (higher, likely thanks to pgbench)
5) standby restarts with -m fast
6) redo starts at 14/230043B0, which is *before* enabling checksums

I guess this is the root cause. A bit more detailed log attached.

I kept stress testing this over the weekend, and I think I found two
issues causing the checksum failures, both for a single node and on a
standby:

1) no checkpoint in the "disable path"

In the "enable" path, a checkpoint it enforced before flipping the state
from "inprogress-on" to "on". It's hidden in the ProcessAllDatabases,
but it's there. But the "off" path does not do that, probably on the
assumption that we'll always see the writes in the WAL order, so that
we'll see the XLOG_CHECKSUMS setting checksums=off before seeing any
writes without checksums.

And in the happy path this works fine - the standby is happy, etc. But
what about after a crash / immediate shutdown? Consider a sequence like
this:

a) we have checksums=on
b) write to page P, updating the checksum
c) start disabling checksums
d) progress to inprogress-off
e) progress to off
f) write to page P, without checksum update
g) the write to P gets evicted (small shared buffers, ...)
h) crash / immediate shutdown

Recovery starts from an LSN before (a), so we believe checksums=on. We
try to redo the write to P, which starts by reading the page from disk,
to check the page LSN. We still think checksums=on, and to read the LSN
we need to verify the checksum. But the page was modified without the
checksum, and evicted. Kabooom!

This is not that hard to trigger by hand. Add a long sleep at the end of
SetDataChecksumsOff, start a pgbench on a scale larger than shared
buffers and call pg_disable_data_checksums(). Once it gets stuck on the
sleep, give it more time to dirty and evict some pages, then kill -9. On
recovery you should get the same checksum failures.
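In TAP form the recipe is roughly the following sketch (the node name,
shared_buffers setting, pgbench scale and sleep duration are just
illustrative values, and it assumes the sleep has been patched into
SetDataChecksumsOff):

my $node = PostgreSQL::Test::Cluster->new('repro');
$node->init(extra => ['--data-checksums']);
$node->append_conf('postgresql.conf', "shared_buffers = '1MB'");
$node->start;
$node->command_ok([ 'pgbench', '-i', '-s', '50', 'postgres' ]);

# keep a write workload running while checksums are being disabled
my $pgbench = IPC::Run::start(
	[ 'pgbench', '-n', '-p', $node->port, '-T', '120', '-c', '4', 'postgres' ]);
$node->safe_psql('postgres', 'SELECT pg_disable_data_checksums()');
sleep(30);                  # worker stuck on the sleep; dirty pages get evicted
$node->stop('immediate');   # simulate the crash (stand-in for kill -9)
$pgbench->kill_kill;
$node->start;               # recovery should now log "page verification failed"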

FWIW I've only ever seen failures on fsm/vm forks, which matches what I
see in the TAP tests. But isn't it a bit strange?

I think the "disable" path needs a checkpoint between inprogress-off and
off states, same as the "enable" path.

2) no restart point on the standby

The standby has a similar issue, I think. Even if the primary creates
all the necessary checkpoints, the standby does not necessarily create a
restart point for them. If you look into xlog_redo, it only "remembers"
the checkpoint position, it does not trigger a restart point. That only
happens in XLogPageRead, based on the distance from the previous one.

So a failure very similar to the one on the primary is possible, even with
the extra checkpoint fixing (1). The primary flips checksums in either
direction, generating checkpoints, but the standby does not create the
corresponding restart points. It still applies WAL, though, and some of the
pages without checksums get evicted.

And then the standby crashes, restarts from some redo position far back, and
runs into the same checksum failure when trying to check the page LSN.

I think the standby needs some logic to force restart point creation
when the checksum flag changed.

I have an experimental WIP branch at:

https://github.com/tvondra/postgres/tree/online-checksums-tap-tweaks

It fixes the TAP issues reported earlier (and a couple more), and it
does a bunch of additional tweaks:

a) A lot of debug messages that helped me to figure this out. This is
probably way too much, especially the control file updates, which can be very
noisy on a standby.

b) Adds a simpler TAP test, testing just a single node (should be easier to
understand than failures on a standby).

c) Adds explicit checkpoints, to fix (1). It probably adds too many
checkpoints, though? AFAICS a checkpoint after the "inprogress" phase
should be enough, and the checkpoint after the "on/off" state can go away.

d) Forces creating a restart point on the first checkpoint after an
XLOG_CHECKSUMS record. It's done in a bit of a silly way, using a static
flag. Maybe there's a more elegant approach, say by comparing the
checksum value in ControlFile to the received checkpoint?

e) Randomizes a couple more GUC values. This needs more thought; it was
done blindly, before I better understood how the failures happen (they
require buffers being evicted, not hitting max_wal_size, ...). There are more
params worth randomizing (e.g. the "fast" flag).

Anyway, with (c) and (d) applied, the checksum failures go away. It may
not be 100% right (e.g. we could probably get away with fewer checkpoints),
but it seems to be the right direction.

I don't have time to clean up the branch more, I've already spent too
much time looking at LSNs advancing in weird ways :-( Hopefully it's
good enough to show what needs to be fixed, etc. If there's a new
version, I'm happy to rerun the tests on my machines, ofc.

However, there still are more bugs. Attached is a log from a crash after
hitting the assert in AbsorbChecksumsOffBarrier:

Assert((LocalDataChecksumVersion != PG_DATA_CHECKSUM_VERSION) &&
(LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION));

This happened while flipping checksums to 'off', but the backend already
thinks checksums are 'off':

LocalDataChecksumVersion==0

I think this implies some bug in setting up LocalDataChecksumVersion
after connection, because this is for a query checking the checksum
state, executed by the TAP test (in a new connection, right?).

I haven't looked into this more, but how come the "off" direction does
not need to check InitialDataChecksumTransition?

I think the TAP test has turned out to be very useful so far. While
investigating this, I thought of a couple more tweaks to make it
detect additional issues (on top of the randomization).

- Right now the shutdowns/restarts happen only in very limited places.
The checksums flip from on to off or off to on, and then a restart happens.
AFAICS a restart never happens in the "inprogress" phases, right?

- The pgbench clients connect once, so there are almost no new
connections while flipping checksums. Maybe some of the pgbenches should
run with "-C", to open new connections. It was pretty lucky the TAP
query hit the assert, this would make it more likely.
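Randomizing that should be trivial with the existing helpers, along the
lines of this sketch (mirroring the current background_ro_pgbench):

# open a new connection per transaction in roughly half of the runs, so
# that backends also start up while the checksum state is being flipped
my @cmd = ('pgbench', '-n', '-p', $port, '-S', '-T', '600', '-c', '10');
push @cmd, '-C' if cointoss();
push @cmd, 'postgres';

$pgbench_standby = IPC::Run::start(\@cmd,
	'<' => '/dev/null', '>' => '/dev/null', '2>' => '/dev/null',
	IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));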

regards

--
Tomas Vondra

Attachments:

assert.log (text/x-log)
#62Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#61)
2 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 1 Sep 2025, at 14:11, Tomas Vondra <tomas@vondra.me> wrote:

I kept stress testing this over the weekend, and I think I found two
issues causing the checksum failures, both for a single node and on a
standby:

Thanks a lot for all your testing. I was able to reproduce ..

I have an experimental WIP branch at:

https://github.com/tvondra/postgres/tree/online-checksums-tap-tweaks

.. and with your changes I also see the reproducer go away. I concur that
you likely found the issue and the right fix for it. I have absorbed your
patches into my branch; the debug logging is left as 0002 and the other ones
are incorporated into 0001.

It fixes the TAP issues reported earlier (and a couple more), and it
does a bunch of additional tweaks:

a) A lot of debug messages that helped me to figure this out. This is
probably way too much, especially the control file updates, which can be very
noisy on a standby.

I've toned down the logging a bit, but kept most of it in 0002.

b) Adds a simpler TAP test, testing just a single node (should be easier to
understand than failures on a standby).

This turned out to be more useful than I initially thought, so I've kept it
in the attached version. There could be value in separating the single- and
dual-node tests into different PG_TEST_EXTRA values, given how intensive the
latter is.
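(If they were split, each test file would just gate on its own value with
the usual pattern; the value name below is made up:)

# e.g. at the top of t/007_pgbench_standby.pl
if (   !$ENV{PG_TEST_EXTRA}
	|| $ENV{PG_TEST_EXTRA} !~ /\bchecksums_standby\b/)
{
	plan skip_all =>
	  'test requires PG_TEST_EXTRA to contain "checksums_standby"';
}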

Anyway, with (c) and (d) applied, the checksum failures go away. It may
not be 100% right (e.g. we could probably get away with fewer checkpoints),
but it seems to be the right direction.

I think so too, and while I have removed one of them due to it being issued just
before (or after) another checkpoint, I do believe this is the right fix for the
issue. There might well be more issues, but I wanted to get a new version out
on the thread to get more visibility on the new tests.

I haven't looked into this more, but how come the "off" direction does
not need to check InitialDataChecksumTransition?

This boiled down to the barrier-absorbing functions evolving out of sync with
one another over multiple versions of the patch. To address this, absorbing the
barrier has been converted into a single function which is driven by an array
of ChecksumBarrierCondition structs, one for each target state. This struct
defines what the current state of the cluster must be for the barrier to be
successfully absorbed. This removes a lot of duplicate code and also unifies
the previously quite varied levels of assertions at the barrier.

I think the TAP test has turned out to be very useful so far. While
investigating this, I thought of a couple more tweaks to make it
detect additional issues (on top of the randomization).

- Right now the shutdowns/restarts happen only in very limited places.
The checksums flip from on to off or off to on, and then a restart happens.
AFAICS a restart never happens in the "inprogress" phases, right?

That would be a good idea, as would using a crashing injection test on top of
the controlled shutdowns.

- The pgbench clients connect once, so there are almost no new
connections while flipping checksums. Maybe some of the pgbenches should
run with "-C", to open new connections. It was pretty lucky the TAP
query hit the assert, this would make it more likely.

I've added this to both tests using pgbench, randomized with a cointoss call.

The attached has the above as well as a few other changes:

* A new injection test which calls abort() right before checkpointing has been
added in 005_injection.

* Most ereport calls have gotten proper errcodes in order to aid analysis,
particularly fleet-wide analysis should this be deployed in a larger setting.

* More code-level documentation of test code and several tweaked (and added)
code comments to aid readability.

--
Daniel Gustafsson

Attachments:

v20251006-0001-Online-enabling-and-disabling-of-data-chec.patch (application/octet-stream)
From 8453324b4b248a28f84cf75e4b7063c67a3456bf Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 15 Aug 2025 14:48:02 +0200
Subject: [PATCH v20251006 1/2] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums app. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

A new test module, test_checksums, is introduced with an extensive
set of tests covering both online and offline data checksum mode
changes.  The tests for online processing are gated behind the
PG_TEST_EXTRA flag to some degree due to being very time consuming
to run.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  During
the work on this new version, Tomas Vondra has given invaluable
assistance with not only coding and reviewing but very in-depth
testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Co-authored-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func/func-admin.sgml             |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/regress.sgml                     |   12 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  678 +++++++-
 src/backend/access/transam/xlogfuncs.c        |   57 +
 src/backend/access/transam/xlogrecovery.c     |   13 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   20 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/auxprocess.c           |   19 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1471 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   14 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |   10 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    4 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    3 +-
 src/backend/utils/init/postinit.c             |   20 +-
 src/backend/utils/misc/guc_parameters.dat     |    5 +-
 src/backend/utils/misc/guc_tables.c           |    9 +-
 src/bin/pg_checksums/pg_checksums.c           |    4 +-
 src/bin/pg_controldata/pg_controldata.c       |    2 +
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   14 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   19 +
 src/include/commands/progress.h               |   17 +
 src/include/miscadmin.h                       |    6 +
 src/include/postmaster/datachecksumsworker.h  |   51 +
 src/include/postmaster/proctypelist.h         |    2 +
 src/include/storage/bufpage.h                 |    2 +-
 src/include/storage/checksum.h                |   15 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/modules/Makefile                     |    1 +
 src/test/modules/meson.build                  |    1 +
 src/test/modules/test_checksums/.gitignore    |    2 +
 src/test/modules/test_checksums/Makefile      |   40 +
 src/test/modules/test_checksums/README        |   22 +
 src/test/modules/test_checksums/meson.build   |   36 +
 .../modules/test_checksums/t/001_basic.pl     |   63 +
 .../modules/test_checksums/t/002_restarts.pl  |  110 ++
 .../test_checksums/t/003_standby_restarts.pl  |  114 ++
 .../modules/test_checksums/t/004_offline.pl   |   82 +
 .../modules/test_checksums/t/005_injection.pl |  126 ++
 .../test_checksums/t/006_pgbench_single.pl    |  268 +++
 .../test_checksums/t/007_pgbench_standby.pl   |  398 +++++
 .../test_checksums/t/DataChecksums/Utils.pm   |  283 ++++
 .../test_checksums/test_checksums--1.0.sql    |   28 +
 .../modules/test_checksums/test_checksums.c   |  225 +++
 .../test_checksums/test_checksums.control     |    4 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   45 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 70 files changed, 4815 insertions(+), 47 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/modules/test_checksums/.gitignore
 create mode 100644 src/test/modules/test_checksums/Makefile
 create mode 100644 src/test/modules/test_checksums/README
 create mode 100644 src/test/modules/test_checksums/meson.build
 create mode 100644 src/test/modules/test_checksums/t/001_basic.pl
 create mode 100644 src/test/modules/test_checksums/t/002_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/003_standby_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/004_offline.pl
 create mode 100644 src/test/modules/test_checksums/t/005_injection.pl
 create mode 100644 src/test/modules/test_checksums/t/006_pgbench_single.pl
 create mode 100644 src/test/modules/test_checksums/t/007_pgbench_standby.pl
 create mode 100644 src/test/modules/test_checksums/t/DataChecksums/Utils.pm
 create mode 100644 src/test/modules/test_checksums/test_checksums--1.0.sql
 create mode 100644 src/test/modules/test_checksums/test_checksums.c
 create mode 100644 src/test/modules/test_checksums/test_checksums.control

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 1b465bc8ba7..f3a8782ede0 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -2979,4 +2979,75 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates data checksums for the cluster. This will switch the data
+        checksums mode to <literal>inprogress-on</literal> as well as start a
+        background worker that will process all pages in the database and
+        enable checksums on them. When all data pages have had checksums
+        enabled, the cluster will automatically switch data checksums mode to
+        <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 8651f0cdb91..9bac0c96348 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -184,6 +184,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm> processes
+     for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 786aa2ac5f6..7b53262bd44 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3527,8 +3527,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3538,8 +3539,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6913,6 +6914,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of a datachecksumworker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher process
+       relation
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of the database being processed, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase; see <xref linkend="datachecksum-phases"/>
+        for a description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the
+        launcher process has this value set; the worker processes
+        have this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process always
+        has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process always has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process always
+        has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process always has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 95043aa329c..0343710af53 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if an online enabling operation was in progress
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations; work already performed by the online
+   processing is not taken into account.
+  </para>
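+
+  <para>
+   For example, checksums can be enabled offline with a command along these
+   lines (the data directory path is just a placeholder):
+<programlisting>
+pg_checksums --enable -D /path/to/data
+</programlisting>
+  </para>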
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8838fe7f022..7074751834e 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -263,6 +263,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
    The following values are currently supported:
    <variablelist>
+    <varlistentry>
+     <term><literal>checksum_extended</literal></term>
+     <listitem>
+      <para>
+       Runs additional tests for enabling data checksums which inject delays
+       and retries in the processing, as well as tests that run pgbench
+       concurrently and randomly restart the cluster.  Some of these test
+       suites require injection points to be enabled in the installation.
+      </para>
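+      <para>
+       For example, these tests can be enabled like the other suites above:
+<programlisting>
+make check-world PG_TEST_EXTRA='checksum_extended'
+</programlisting>
+      </para>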
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><literal>kerberos</literal></term>
      <listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time, either as an offline
+   operation or online in a running cluster while allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
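+
+   <para>
+    For example, using the default settings described below, processing can be
+    started and later stopped with:
+<programlisting>
+SELECT pg_enable_data_checksums();
+SELECT pg_disable_data_checksums();
+</programlisting>
+   </para>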
+
+   <para>
+    Enabling checksums will put the cluster's data checksum state into
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum state will automatically switch to <literal>on</literal>. The
+    processing consumes two background worker processes, so make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to be removed before it finishes.
+    If the application uses long-lived temporary tables, it may be necessary
+    to terminate those connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can cause a significant I/O load on the system, as most
+     of the database pages will need to be rewritten, both in the data files
+     and in the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
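+
+   <para>
+    As an illustration, with made-up throttling values, a throttled run could
+    be started with:
+<programlisting>
+SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
+</programlisting>
+   </para>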
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index eceab341255..59f5cbe839f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -286,6 +286,11 @@ static XLogRecPtr RedoRecPtr;
  */
 static bool doPageWrites;
 
+/*
+ * Force creating a restart point on the next CHECKPOINT after XLOG_CHECKSUMS.
+ */
+static bool checksumRestartPoint = false;
+
 /*----------
  * Shared-memory data structures for XLOG control
  *
@@ -550,6 +555,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -573,6 +581,44 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
  */
 static ControlFileData *ControlFile = NULL;
 
+/*
+ * This must match the largest number of states in the barrier_eq and
+ * barrier_ne sets in the checksum_barriers definition below.
+ */
+#define MAX_BARRIER_CONDITIONS 2
+
+/*
+ * Configuration of conditions which must match when absorbing a procsignal
+ * barrier during data checksum enable/disable operations.  A single function
+ * is used for absorbing all barriers, and the set of conditions to use is
+ * looked up in the checksum_barriers struct.  The struct member for the target
+ * state defines which state the backend must currently be in, and which it
+ * must not be in.
+ */
+typedef struct ChecksumBarrierCondition
+{
+	/* The target state of the barrier */
+	int			target;
+	/* A set of states in which at least one MUST match the current state */
+	int			barrier_eq[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_eq set */
+	int			barrier_eq_sz;
+	/* A set of states which all MUST NOT match the current state */
+	int			barrier_ne[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_ne set */
+	int			barrier_ne_sz;
+}			ChecksumBarrierCondition;
+
+static const ChecksumBarrierCondition checksum_barriers[] =
+{
+	{PG_DATA_CHECKSUM_OFF, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION}, 2, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_VERSION, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION}, 1, {0}, 0},
+	{PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, {PG_DATA_CHECKSUM_ANY_VERSION}, 1, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, {PG_DATA_CHECKSUM_VERSION}, 1, {0}, 0},
+	{-1}
+};
+
+
 /*
  * Calculate the amount of space left on the page after 'endptr'. Beware
  * multiple evaluation!
@@ -647,6 +693,36 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version.  After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing.  The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state.  Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_ON_VERSION this would
+ * trigger an assert failure (this is the only transition with an assert) when
+ * processing the barrier.  This may happen if the process is spawned between
+ * the update of XLogCtl->data_checksum_version and the barrier being emitted.
+ * This can only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +791,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(uint32 new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -828,9 +906,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup, or if checksums need to be written (all of which force
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -843,7 +922,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4249,6 +4330,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4573,9 +4660,9 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	elog(LOG, "ReadControlFile checkpoint %X/%08X redo %X/%08X",
+		 LSN_FORMAT_ARGS(ControlFile->checkPoint),
+		 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo));
 }
 
 /*
@@ -4609,13 +4696,430 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of
+ * being enabled or disabled. During the "inprogress-on" and "inprogress-off"
+ * states checksums must
+ * be written even though they are not verified (see datachecksumsworker.c for
+ * a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages, it can thus be used to check for
+ * aborted checksum processing which need to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(bool immediate_checkpoint)
+{
+	uint64		barrier;
+	int			flags;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	/*
+	 * force checkpoint to persist the current checksum state in control file
+	 * etc.
+	 *
+	 * XXX is this needed? There's already a checkpoint at the end of
+	 * ProcessAllDatabases, maybe this is redundant?
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(bool immediate_checkpoint)
 {
+	uint64		barrier;
+	int			flags;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	INJECTION_POINT("datachecksums-enable-checksums-pre-checkpoint", NULL);
+
+	/* XXX is this needed? */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(bool immediate_checkpoint)
+{
+	uint64		barrier;
+	int			flags;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * force checkpoint to persist the current checksum state in control
+		 * file etc.
+		 *
+		 * XXX is this safe? What if the crash/shutdown happens while waiting
+		 * for the checkpoint? Also, should we persist the checksum first and
+		 * only then flip the flag in XLogCtl?
+		 */
+		INJECTION_POINT("datachecksums-disable-checksums-pre-checkpoint", NULL);
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * AbsorbDataChecksumsBarrier
+ *		Generic function for absorbing data checksum state changes
+ *
+ * All procsignalbarriers regarding data checksum state changes are absorbed
+ * with this function.  The set of conditions required for the state change to
+ * be accepted are listed in the checksum_barriers struct, target_state is
+ * used to look up the relevant entry.
+ */
+bool
+AbsorbDataChecksumsBarrier(int target_state)
+{
+	const		ChecksumBarrierCondition *condition = checksum_barriers;
+	int			current = LocalDataChecksumVersion;
+	bool		found = false;
+
+	/*
+	 * Find the barrier condition definition for the target state. Not finding
+	 * a condition would be a grave programmer error as the states are a
+	 * discrete set.
+	 */
+	while (condition->target != target_state && condition->target != -1)
+		condition++;
+	if (unlikely(condition->target == -1))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid target state %i for data checksum barrier",
+					   target_state));
+
+	/*
+	 * The current state MUST be equal to one of the EQ states defined in this
+	 * barrier condition, or equal to the target_state if - and only if -
+	 * InitialDataChecksumTransition is true.
+	 */
+	for (int i = 0; i < condition->barrier_eq_sz; i++)
+	{
+		if (current == condition->barrier_eq[i] ||
+			condition->barrier_eq[i] == PG_DATA_CHECKSUM_ANY_VERSION)
+			found = true;
+	}
+	if (InitialDataChecksumTransition && current == target_state)
+		found = true;
+
+	/*
+	 * The current state MUST NOT be equal to any of the NE states defined in
+	 * this barrier condition.
+	 */
+	for (int i = 0; i < condition->barrier_ne_sz; i++)
+	{
+		if (current == condition->barrier_ne[i])
+			found = false;
+	}
+
+	/*
+	 * If the relevant state criteria aren't satisfied, throw an error which
+	 * will be caught by the procsignal machinery for a later retry.
+	 */
+	if (!found)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("incorrect data checksum state %i for target state %i",
+					   current, target_state));
+
+	SetLocalDataChecksumVersion(target_state);
+	InitialDataChecksumTransition = false;
+	return true;
+}
+
+/*
+ * InitLocalDataChecksumVersion
+ *
+ * Set up backend local caches of controldata variables which may change at
+ * any point during runtime and thus require special cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+/*
+ * Update the backend-local cache of the data checksum version, keeping the
+ * data_checksums GUC variable in sync with it.
+ */
+static void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	data_checksums = data_checksum_version;
+}
+
+/* GUC show hook for data_checksums */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -4890,6 +5394,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -5059,6 +5564,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* Use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6200,6 +6710,47 @@ StartupXLOG(void)
 	pfree(endOfRecoveryInfo->recoveryStopReason);
 	pfree(endOfRecoveryInfo);
 
+	/*
+	 * If we reach this point with checksums in the state inprogress-on, it
+	 * means that data checksums were in the process of being enabled when the
+	 * cluster shut down. Since processing didn't finish, the operation will
+	 * have to be restarted from scratch as there is no capability to
+	 * continue from where it was when the cluster shut down. Thus, revert the
+	 * state back to off, and inform the user with a warning message. Being
+	 * able to restart processing is a TODO, but it wouldn't be possible to
+	 * restart here since we cannot launch a dynamic background worker
+	 * directly from here (it has to be from a regular backend).
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		ereport(WARNING,
+				(errmsg("data checksums state has been set of off"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+	}
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -6491,7 +7042,7 @@ GetRedoRecPtr(void)
 	XLogRecPtr	ptr;
 
 	/*
-	 * The possibly not up-to-date copy in XlogCtl is enough. Even if we
+	 * The possibly not up-to-date copy in XLogCtl is enough. Even if we
 	 * grabbed a WAL insertion lock to read the authoritative value in
 	 * Insert->RedoRecPtr, someone might update it just after we've released
 	 * the lock.
@@ -7055,6 +7606,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from XLogCtl, valid at the
+	 * time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7310,6 +7867,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7453,6 +8013,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7794,6 +8358,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8205,6 +8773,26 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	INJECTION_POINT("datachecksums-xlogchecksums-pre-xloginsert", &new_type);
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8639,6 +9227,74 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		/*
+		 * XXX Could this end up written to the control file prematurely? IIRC
+		 * that happens during checkpoint, so what if that gets triggered e.g.
+		 * because someone runs CHECKPOINT? If we then crash (or something
+		 * like that), could that confuse the instance?
+		 */
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+
+		/*
+		 * force creating a restart point for the first CHECKPOINT after
+		 * seeing XLOG_CHECKSUMS in WAL
+		 */
+		checksumRestartPoint = true;
+	}
+
+	if (checksumRestartPoint &&
+		(info == XLOG_CHECKPOINT_ONLINE ||
+		 info == XLOG_CHECKPOINT_REDO ||
+		 info == XLOG_CHECKPOINT_SHUTDOWN))
+	{
+		int			flags;
+
+		elog(LOG, "forcing creation of a restart point after XLOG_CHECKSUMS");
+
+		/* We explicitly want an immediate checkpoint here */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+
+		checksumRestartPoint = false;
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..d786374209f 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,59 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	bool		fast = PG_GETARG_BOOL(0);
+
+	ereport(LOG,
+			errmsg("disable_data_checksums, fast: %d", fast));
+
+	if (!superuser())
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				errmsg("must be superuser to change data checksum state"));
+
+	StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	ereport(LOG,
+			errmsg("enable_data_checksums, cost_delay: %d cost_limit: %d fast: %d", cost_delay, cost_limit, fast));
+
+	if (!superuser())
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				errmsg("must be superuser to change data checksum state"));
+
+	if (cost_delay < 0)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 52ff4d119e6..5fee73e617a 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -782,6 +782,10 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
 		RedoStartLSN = ControlFile->checkPointCopy.redo;
 		RedoStartTLI = ControlFile->checkPointCopy.ThisTimeLineID;
+
+		elog(LOG, "InitWalRecovery checkpoint %X/%08X redo %X/%08X",
+			 LSN_FORMAT_ARGS(CheckPointLoc), LSN_FORMAT_ARGS(RedoStartLSN));
+
 		record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
 									  CheckPointTLI);
 		if (record != NULL)
@@ -1665,6 +1669,9 @@ PerformWalRecovery(void)
 	bool		reachedRecoveryTarget = false;
 	TimeLineID	replayTLI;
 
+	elog(LOG, "PerformWalRecovery checkpoint %X/%08X redo %X/%08X",
+		 LSN_FORMAT_ARGS(CheckPointLoc), LSN_FORMAT_ARGS(RedoStartLSN));
+
 	/*
 	 * Initialize shared variables for tracking progress of WAL replay, as if
 	 * we had just replayed the record before the REDO location (or the
@@ -1673,12 +1680,14 @@ PerformWalRecovery(void)
 	SpinLockAcquire(&XLogRecoveryCtl->info_lck);
 	if (RedoStartLSN < CheckPointLoc)
 	{
+		elog(LOG, "(RedoStartLSN < CheckPointLoc)");
 		XLogRecoveryCtl->lastReplayedReadRecPtr = InvalidXLogRecPtr;
 		XLogRecoveryCtl->lastReplayedEndRecPtr = RedoStartLSN;
 		XLogRecoveryCtl->lastReplayedTLI = RedoStartTLI;
 	}
 	else
 	{
+		elog(LOG, "(RedoStartLSN >= CheckPointLoc)");
 		XLogRecoveryCtl->lastReplayedReadRecPtr = xlogreader->ReadRecPtr;
 		XLogRecoveryCtl->lastReplayedEndRecPtr = xlogreader->EndRecPtr;
 		XLogRecoveryCtl->lastReplayedTLI = CheckPointTLI;
@@ -1690,6 +1699,10 @@ PerformWalRecovery(void)
 	XLogRecoveryCtl->recoveryPauseState = RECOVERY_NOT_PAUSED;
 	SpinLockRelease(&XLogRecoveryCtl->info_lck);
 
+	elog(LOG, "PerformWalRecovery lastReplayedReadRecPtr %X/%08X lastReplayedEndRecPtr %X/%08X",
+		 LSN_FORMAT_ARGS(XLogRecoveryCtl->lastReplayedReadRecPtr),
+		 LSN_FORMAT_ARGS(XLogRecoveryCtl->lastReplayedEndRecPtr));
+
 	/* Also ensure XLogReceiptTime has a sane value */
 	XLogReceiptTime = GetCurrentTimestamp();
 
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index bb7d90aa5d9..54dcfbcb333 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6d9e9..0f0c5b5c7fe 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -657,6 +657,22 @@ LANGUAGE INTERNAL
 STRICT VOLATILE PARALLEL UNSAFE
 AS 'pg_replication_origin_session_setup';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+RETURNS void
+STRICT VOLATILE LANGUAGE internal
+PARALLEL RESTRICTED
+AS 'enable_data_checksums';
+
+CREATE OR REPLACE FUNCTION
+  pg_disable_data_checksums(fast boolean DEFAULT false)
+RETURNS void
+STRICT VOLATILE LANGUAGE internal
+PARALLEL RESTRICTED
+AS 'disable_data_checksums';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -782,6 +798,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM PUBLIC;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums(boolean) FROM PUBLIC;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 884b6a23817..012bd34c6d6 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1358,6 +1358,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..5742a1dd724 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 1ad65c237c3..0d2ade1f905 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -132,6 +133,12 @@ static const struct
 	},
 	{
 		"TablesyncWorkerMain", TablesyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..3deb57a96de
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1471 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or with pg_checksums while
+ * the cluster is shut down, no extra process is required as each page is
+ * checksummed, and verified, when accessed.  When enabling checksums on an
+ * already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to process data checksum validation.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
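+ *
+ * A minimal sketch of that write-side pattern (illustrative only; the real
+ * callsites live in the buffer manager and storage manager code):
+ *
+ *     HOLD_INTERRUPTS();
+ *     if (DataChecksumsNeedWrite())
+ *         PageSetChecksumInplace(page, blkno);
+ *     smgrwrite(...);
+ *     RESUME_INTERRUPTS();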
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, so that
+ *   backends still in Be can continue to validate pages until they too have
+ *   absorbed the barrier and moved to Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background workers cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page happens to already be correct we still dirty the
+ *     page. It should be enough to only do the log_newpage_buffer() call in
+ *     that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to skip pages which already have a valid checksum
+ *     when it is used to enable checksums on a cluster left in the
+ *     inprogress-on state (i.e., make pg_checksums able to resume an
+ *     operation that was started online).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
+	 * launcher what the target state is.
+	 */
+	DataChecksumsWorkerOperation launch_operation;
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	DataChecksumsWorkerOperation operation;
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static DataChecksumsWorkerOperation operation;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * Called by the backend executing pg_enable_data_checksums() or
+ * pg_disable_data_checksums() to record the requested target state and to
+ * start the launcher process unless one is already running.
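+ *
+ * A sketch of the intended call path (the SQL function bodies live elsewhere
+ * in this patch; the parameter names here are illustrative only):
+ *
+ *     pg_enable_data_checksums(...)  -> StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, delay, limit, fast)
+ *     pg_disable_data_checksums(...) -> StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast)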
+ */
+void
+StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+#ifdef USE_ASSERT_CHECKING
+	/* The cost delay settings have no effect when disabling */
+	if (op == DISABLE_DATACHECKSUMS)
+		Assert(cost_delay == 0 && cost_limit == 0);
+#endif
+
+	/* Store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_operation = op;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksum launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksum launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+					errmsg("failed to start background worker to process data checksums"));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * As of now we only update the block counter for main forks in order to
+	 * avoid overly frequent calls. TODO: investigate whether we should do it
+	 * more frequently.
+	 */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  TODO: investigate if this could be
+		 * avoided if the checksum is calculated to be correct and wal_level
+		 * is set to "minimal",
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check whether we have been asked
+		 * to abort; the abort request will bubble up from here. It's safe to
+		 * check this without a lock, because if we miss it being set, we
+		 * will try again soon.
+		 */
+		Assert(operation == ENABLE_DATACHECKSUMS);
+		if (DataChecksumsWorkerShmem->launch_operation == DISABLE_DATACHECKSUMS)
+			abort_requested = true;
+
+		if (abort_requested)
+			return false;
+
+		/*
+		 * As of now we only update the block counter for main forks in order
+		 * to avoid overly frequent calls. TODO: investigate whether we
+		 * should do it more frequently.
+		 */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
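+
+	/* Make sure rel->rd_smgr is set up before probing each fork below */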
+	RelationGetSmgr(rel);
+
+	for (ForkNumber fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksum worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksum worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+					   db->dbname),
+				errhint("The max_worker_processes setting might be too low."));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+					   db->dbname),
+				errhint("More details on the error might be found in the server log."));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't have changed the pg_control version to enabled at this point,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errcode(ERRCODE_ADMIN_SHUTDOWN),
+				errmsg("cannot enable data checksums without the postmaster process"),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			errmsg("initiating data checksum processing in database \"%s\"",
+				   db->dbname));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errcode(ERRCODE_ADMIN_SHUTDOWN),
+				errmsg("postmaster exited during data checksum processing in \"%s\"",
+					   db->dbname),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				errmsg("data checksums processing was aborted in database \"%s\"",
+					   db->dbname));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again after this launcher has exited.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check the abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Waits for all currently running transactions to finish
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, the process exits
+ * with FATAL since processing cannot be completed cluster-wide.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errcode(ERRCODE_ADMIN_SHUTDOWN),
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function handles the bgworker management,
+ * while ProcessAllDatabases is responsible for looping over the databases
+ * and initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksum launcher\" started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	operation = DataChecksumsWorkerShmem->launch_operation;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->operation = operation;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress(DataChecksumsWorkerShmem->immediate_checkpoint);
+
+		/*
+		 * All backends are now in inprogress-on state and are writing data
+		 * checksums.  Start processing all data at rest.
+		 */
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_operation != operation)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+					errmsg("unable to enable data checksums in cluster"));
+		}
+
+		/*
+		 * Data checksums have been set on all pages, set the state to on in
+		 * order to instruct backends to validate checksums on reading.
+		 */
+		SetDataChecksumsOn(DataChecksumsWorkerShmem->immediate_checkpoint);
+	}
+	else
+	{
+		int			flags;
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff(DataChecksumsWorkerShmem->immediate_checkpoint);
+
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (DataChecksumsWorkerShmem->immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_operation != operation)
+	{
+		DataChecksumsWorkerShmem->operation = DataChecksumsWorkerShmem->launch_operation;
+		operation = DataChecksumsWorkerShmem->launch_operation;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums. Until no new databases are found, this will loop around computing
+ * a new list and comparing it to the already seen ones.
+ *
+ * If immediate_checkpoint is set to true then the requested checkpoints will
+ * use CHECKPOINT_FAST. This is useful for testing but should be avoided in
+ * production use as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also covers the shared
+	 * catalogs, rather than processing them once in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/* Allow a test case to modify the initial list of databases */
+	INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number is not changed during processing; the column for
+	 * processed databases is instead increased so that it can be compared
+	 * against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			/* Allow a test process to alter the result of the operation */
+			INJECTION_POINT("datachecksumsworker-fail-db", &result);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting with a new list" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. Failure to enable checksums for a database can be because
+	 * processing actually failed for some reason, or because the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists, but enabling checksums failed then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in the processed databases which failed, and
+		 * where the failed database still exists.  This indicates that
+		 * enabling checksums actually failed, and not that the failure was
+		 * due to the db being concurrently dropped.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					errmsg("failed to enable data checksums in \"%s\"", db->dbname));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff(DataChecksumsWorkerShmem->immediate_checkpoint);
+		/* Force a checkpoint to make everything consistent */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+				errmsg("data checksums failed to get enabled in all databases, aborting"),
+				errhint("The server log might have more information on the cause of the error."));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is for running tests, as they would otherwise not execute
+	 * in such a way that they can reliably be placed under timeout control.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BARRIER);
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even if these assignments are redundant, we want to be explicit
+		 * about our intent for readability, and this state may need to be
+		 * queried if restartability is added later.
+		 */
+		DataChecksumsWorkerShmem->launch_operation = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations with storage (which thus need data
+ * checksums) are returned. If include_shared is true then shared relations
+ * are included as well in a non-temporary list; include_shared has no
+ * relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
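+	/*
+	 * The per-database worker is only ever launched for enabling checksums;
+	 * disabling doesn't require any per-page work.
+	 */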
+	operation = ENABLE_DATACHECKSUMS;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access and modify these relations.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->operation == ENABLE_DATACHECKSUMS);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* Process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				errmsg("data checksum processing aborted in database OID %u",
+					   dboid));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		INJECTION_POINT("datachecksumsworker-fake-temptable-wait", &numleft);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, indicate in pgstat
+		 * activity and progress reporting.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 3000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_TEMPTABLE_WAIT);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_operation != operation;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					errmsg("data checksum processing aborted in database OID %u",
+						   dboid));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e1d643b013d..3d15a894c3a 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2983,6 +2983,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2fa045e6b0f..44213d140ae 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..2f6ccdfb32f 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,12 +18,14 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/pg_bitutils.h"
 #include "replication/logicalworker.h"
 #include "replication/walsender.h"
+#include "storage/checksum.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
@@ -576,6 +578,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_OFF);
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index dbb49ed9197..d6510651a1c 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -151,8 +151,8 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 		if ((flags & (PIV_LOG_WARNING | PIV_LOG_LOG)) != 0)
 			ereport(flags & PIV_LOG_WARNING ? WARNING : LOG,
 					(errcode(ERRCODE_DATA_CORRUPTED),
-					 errmsg("page verification failed, calculated checksum %u but expected %u",
-							checksum, p->pd_checksum)));
+					 errmsg("page verification failed, calculated checksum %u but expected %u (page LSN %X/%08X)",
+							checksum, p->pd_checksum, LSN_FORMAT_ARGS(PageXLogRecPtrGet(p->pd_lsn)))));
 
 		if (header_sane && (flags & PIV_IGNORE_CHECKSUM_FAILURE))
 			return true;
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a864ae8e6a6..f396d2c9cfb 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -379,6 +379,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 13ae57ed649..a290d56f409 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -362,6 +362,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 7553f6eacef..430178c699c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -116,6 +116,9 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
+CHECKSUM_ENABLE_TEMPTABLE_WAIT	"Waiting for temporary tables to be dropped for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -355,6 +358,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 7e89a8048d5..8d4acce4f53 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -277,6 +277,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1149,9 +1151,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1167,9 +1166,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index fec79992c8d..9b78e0012ef 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -844,7 +844,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 641e535a73c..589e7eab9e8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -750,6 +750,24 @@ InitPostgres(const char *in_dbname, Oid dboid,
 
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
@@ -878,7 +896,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 6bc6be13d2a..1555f1386ee 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -805,11 +805,12 @@
   boot_val => 'false',
 },
 
-{ name => 'data_checksums', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+{ name => 'data_checksums', type => 'enum', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
   short_desc => 'Shows whether data checksums are turned on for this cluster.',
   flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED',
   variable => 'data_checksums',
-  boot_val => 'false',
+  boot_val => 'PG_DATA_CHECKSUM_OFF',
+  options => 'data_checksums_options',
 },
 
 { name => 'syslog_sequence_numbers', type => 'bool', context => 'PGC_SIGHUP', group => 'LOGGING_WHERE',
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 00c8376cf4d..ae1506d87f5 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", PG_DATA_CHECKSUM_VERSION, true},
+	{"off", PG_DATA_CHECKSUM_OFF, true},
+	{"inprogress-on", PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, true},
+	{"inprogress-off", PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -617,7 +625,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index f20be82862a..8411cecf3ff 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -568,7 +568,7 @@ main(int argc, char *argv[])
 		ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
 		pg_fatal("cluster must be shut down");
 
-	if (ControlFile->data_checksum_version == 0 &&
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_CHECK)
 		pg_fatal("data checksums are not enabled in cluster");
 
@@ -576,7 +576,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 10de058ce91..acf5c7b026e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index d12798be3d8..2463a617c8d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -229,7 +230,16 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(bool immediate_checkpoint);
+extern void SetDataChecksumsOn(bool immediate_checkpoint);
+extern void SetDataChecksumsOff(bool immediate_checkpoint);
+extern bool AbsorbDataChecksumsBarrier(int target_state);
+extern const char *show_data_checksums(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index cc06fc29ab2..cc78b00fe4c 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	uint32		new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..a8877fb87d1 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;	/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5d5a9483fec..59f453f227f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12372,6 +12372,25 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'bool', proallargtypes => '{bool}',
+  proargmodes => '{i}',
+  proargnames => '{fast}',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
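For illustration, a session would drive the functions defined by these entries
roughly as below.  This is a minimal sketch, assuming only the signatures above
and the data_checksums GUC values registered in guc_tables.c; the argument
values are arbitrary, any argument defaults live outside this hunk, and
cost_delay/cost_limit are assumed to be vacuum-style cost throttling as their
names suggest.

    -- Start enabling data checksums in the background; "fast" presumably maps
    -- to the immediate_checkpoint argument of the SetDataChecksums*() calls.
    SELECT pg_enable_data_checksums(cost_delay => 0, cost_limit => 100,
                                    fast => false);

    -- The cluster-wide state is visible through the data_checksums GUC and
    -- should pass through 'inprogress-on' before reaching 'on'.
    SHOW data_checksums;

    -- Disable checksums again, this time requesting an immediate checkpoint.
    SELECT pg_disable_data_checksums(fast => true);
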
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1cde4bd9bcf..d2aa148533b 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -162,4 +162,21 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BARRIER	5
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1bef98471c3..2a0d7b6de42 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -366,6 +366,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -391,6 +394,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..2cd066fd0fe
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for data checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Possible operations the Datachecksumsworker can perform */
+typedef enum DataChecksumsWorkerOperation
+{
+	ENABLE_DATACHECKSUMS,
+	DISABLE_DATACHECKSUMS,
+	/* TODO: VERIFY_DATACHECKSUMS, */
+}			DataChecksumsWorkerOperation;
+
+/*
+ * Possible states for a database entry which has been processed. Exported
+ * here since we want to be able to reference this from injection point tests.
+ */
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/* Start the background processes for enabling or disabling checksums */
+extern void StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+extern void DataChecksumsWorkerLauncherMain(Datum arg);
+extern void DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/postmaster/proctypelist.h b/src/include/postmaster/proctypelist.h
index 242862451d8..3dc93b176d9 100644
--- a/src/include/postmaster/proctypelist.h
+++ b/src/include/postmaster/proctypelist.h
@@ -38,6 +38,8 @@ PG_PROCTYPE(B_BACKEND, gettext_noop("client backend"), BackendMain, true)
 PG_PROCTYPE(B_BG_WORKER, gettext_noop("background worker"), BackgroundWorkerMain, true)
 PG_PROCTYPE(B_BG_WRITER, gettext_noop("background writer"), BackgroundWriterMain, true)
 PG_PROCTYPE(B_CHECKPOINTER, gettext_noop("checkpointer"), CheckpointerMain, true)
+PG_PROCTYPE(B_DATACHECKSUMSWORKER_LAUNCHER, gettext_noop("datachecksum launcher"), NULL, false)
+PG_PROCTYPE(B_DATACHECKSUMSWORKER_WORKER, gettext_noop("datachecksum worker"), NULL, false)
 PG_PROCTYPE(B_DEAD_END_BACKEND, gettext_noop("dead-end client backend"), BackendMain, true)
 PG_PROCTYPE(B_INVALID, gettext_noop("unrecognized"), NULL, false)
 PG_PROCTYPE(B_IO_WORKER, gettext_noop("io worker"), IoWorkerMain, true)
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index aeb67c498c5..30fb0f62d4c 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -16,6 +16,7 @@
 
 #include "access/xlogdefs.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/item.h"
 #include "storage/off.h"
 
@@ -205,7 +206,6 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
-#define PG_DATA_CHECKSUM_VERSION	1
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..0faaac14b1b 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,21 @@
 
 #include "storage/block.h"
 
+/*
+ * Checksum version 0 is used when data checksums are disabled (OFF).
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION,
+	PG_DATA_CHECKSUM_ANY_VERSION
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 06a1ffd4b08..b8f7ba0be51 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -85,6 +85,7 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
+PG_LWLOCK(54, DataChecksumsWorker)
 
 /*
  * There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The DataChecksums worker and launcher
+ * can each consume a slot while data checksums are being enabled or disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..28c8d0bd3cf 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -18,6 +18,7 @@ SUBDIRS = \
 		  test_binaryheap \
 		  test_bitmapset \
 		  test_bloomfilter \
+		  test_checksums \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
 		  test_ddl_deparse \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..88b8b369534 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -17,6 +17,7 @@ subdir('test_aio')
 subdir('test_binaryheap')
 subdir('test_bitmapset')
 subdir('test_bloomfilter')
+subdir('test_checksums')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
 subdir('test_ddl_deparse')
diff --git a/src/test/modules/test_checksums/.gitignore b/src/test/modules/test_checksums/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/modules/test_checksums/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
new file mode 100644
index 00000000000..a5b6259a728
--- /dev/null
+++ b/src/test/modules/test_checksums/Makefile
@@ -0,0 +1,40 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/modules/test_checksums
+#
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/modules/test_checksums/Makefile
+#
+#-------------------------------------------------------------------------
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+MODULE_big = test_checksums
+OBJS = \
+	$(WIN32RES) \
+	test_checksums.o
+PGFILEDESC = "test_checksums - test code for data checksums"
+
+EXTENSION = test_checksums
+DATA = test_checksums--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/test/modules/test_checksums/README b/src/test/modules/test_checksums/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/modules/test_checksums/README
@@ -0,0 +1,22 @@
+src/test/modules/test_checksums/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: This creates a temporary installation (in the case of "check")
+with multiple nodes, primary and standby(s), for the purpose of the
+tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
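+
+NOTE: Several of the longer-running and randomized tests are skipped unless
+PG_TEST_EXTRA contains "checksum_extended", and the injection point tests
+additionally require a build with injection points enabled.  To include the
+extended tests, run for example:
+
+    PG_TEST_EXTRA=checksum_extended make check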
diff --git a/src/test/modules/test_checksums/meson.build b/src/test/modules/test_checksums/meson.build
new file mode 100644
index 00000000000..ffc737ca87a
--- /dev/null
+++ b/src/test/modules/test_checksums/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+test_checksums_sources = files(
+  'test_checksums.c',
+)
+
+test_checksums = shared_module('test_checksums',
+  test_checksums_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += test_checksums
+
+test_install_data += files(
+  'test_checksums.control',
+  'test_checksums--1.0.sql',
+)
+
+tests += {
+  'name': 'test_checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+      'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+    },
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+      't/005_injection.pl',
+      't/006_pgbench_single.pl',
+      't/007_pgbench_standby.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_checksums/t/001_basic.pl b/src/test/modules/test_checksums/t/001_basic.pl
new file mode 100644
index 00000000000..728a5c4510c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/001_basic.pl
@@ -0,0 +1,63 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+test_checksum_state($node, 'off');
+
+# Enable data checksums and wait for the state transition to 'on'
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1 ");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op so we explicitly don't
+# wait for any state transition as none should happen here
+enable_data_checksums($node);
+test_checksum_state($node, 'on');
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again and wait for the state transition
+disable_data_checksums($node, wait => 'off');
+
+# Test reading data again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/002_restarts.pl b/src/test/modules/test_checksums/t/002_restarts.pl
new file mode 100644
index 00000000000..6c17f304eac
--- /dev/null
+++ b/src/test/modules/test_checksums/t/002_restarts.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with a
+# restart which breaks processing.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Initialize result storage for queries
+my $result;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 6
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Create a barrier for checksumming to block on, in this case a pre-
+	# existing temporary table which is kept open while processing is started.
+	# We can accomplish this by setting up an interactive psql process which
+	# keeps the temporary table alive while we enable checksums from another
+	# psql process.
+	#
+	# This is a similar test to the synthetic variant in 005_injection.pl
+	# which fakes this scenario.
+	my $bsession = $node->background_psql('postgres');
+	$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+	# In another session, make sure we can see the blocking temp table but
+	# start processing anyways and check that we are blocked with a proper
+	# wait event.
+	$result = $node->safe_psql('postgres',
+		"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';"
+	);
+	is($result, 't', 'ensure we can see the temporary table');
+
+	# Enabling data checksums shouldn't complete as processing is blocked on
+	# the temporary table held open by $bsession. Ensure that we reach
+	# inprogress-on before we do more tests.
+	enable_data_checksums($node, wait => 'inprogress-on');
+
+	# Wait for processing to finish and the worker waiting for leftover temp
+	# relations to be able to actually finish
+	$result = $node->poll_query_until(
+		'postgres',
+		"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksum worker';",
+		'ChecksumEnableTemptableWait');
+
+	# The datachecksumsworker waits for temporary tables to disappear for 3
+	# seconds before retrying, so sleep for 4 seconds to be guaranteed to see
+	# a retry cycle
+	sleep(4);
+
+	# Re-check the wait event to ensure we are blocked on the right thing.
+	$result = $node->safe_psql('postgres',
+			"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksum worker';");
+	is($result, 'ChecksumEnableTemptableWait',
+		'ensure the correct wait condition is set');
+	test_checksum_state($node, 'inprogress-on');
+
+	# Stop the cluster while bsession is still attached.  We can't close the
+	# session first since the brief period between closing and stopping might
+	# be enough for checksums to get enabled.
+	$node->stop;
+	$bsession->quit;
+	$node->start;
+
+	# Ensure the checksums aren't enabled across the restart.  This leaves the
+	# cluster in the same state as before we entered the SKIP block.
+	test_checksum_state($node, 'off');
+}
+
+enable_data_checksums($node, wait => 'on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksum%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+disable_data_checksums($node, wait => 1);
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..f724d4ea74c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -0,0 +1,114 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+test_checksum_state($node_primary, 'off');
+test_checksum_state($node_standby_1, 'off');
+
+# ---------------------------------------------------------------------------
+# Enable checksums for the cluster, and make sure that both the primary and
+# standby change state.
+#
+
+# Ensure that the primary switches to "inprogress-on"
+enable_data_checksums($node_primary, wait => 'inprogress-on');
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+my $result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary and standby
+wait_for_checksum_state($node_primary, 'on');
+wait_for_checksum_state($node_standby_1, 'on');
+
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, '19998', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksum%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+#
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+#
+
+# Disable checksums and wait for the operation to be replayed
+disable_data_checksums($node_primary);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+# Ensure that the primary and standby have switched to off
+wait_for_checksum_state($node_primary, 'off');
+wait_for_checksum_state($node_standby_1, 'off');
+# Double-check reading data without errors
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, "19998", 'ensure we can safely read all data without checksums');
+
+$node_standby_1->stop;
+$node_primary->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/004_offline.pl b/src/test/modules/test_checksums/t/004_offline.pl
new file mode 100644
index 00000000000..e9fbcf77eab
--- /dev/null
+++ b/src/test/modules/test_checksums/t/004_offline.pl
@@ -0,0 +1,82 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums from another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+enable_data_checksums($node, wait => 'inprogress-on');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/005_injection.pl b/src/test/modules/test_checksums/t/005_injection.pl
new file mode 100644
index 00000000000..ae801cd336f
--- /dev/null
+++ b/src/test/modules/test_checksums/t/005_injection.pl
@@ -0,0 +1,126 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# injection point tests injecting failures into the processing
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# ---------------------------------------------------------------------------
+# Test cluster setup
+#
+
+# Initiate testcluster
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Set up test environment
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+
+# ---------------------------------------------------------------------------
+# Inducing failures and crashes in processing
+
+# Force enabling checksums to fail by marking one of the databases as having
+# failed in processing.
+disable_data_checksums($node, wait => 1);
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(true);');
+enable_data_checksums($node, wait => 'off');
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(false);');
+
+# Force the server to crash after enabling data checksums but before issuing
+# the checkpoint.  Since the switch has been WAL logged the server should come
+# up with checksums enabled after replay.
+test_checksum_state($node, 'off');
+$node->safe_psql('postgres', 'SELECT dc_crash_before_checkpoint();');
+enable_data_checksums($node, fast => 'true');
+my $ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed due to abort() before checkpointing");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'on');
+
+# Another test just like the previous, but for disabling data checksums (and
+# crashing just before checkpointing).  The previous injection points were all
+# detached by the crash, so they need to be reattached.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_checkpoint();');
+disable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed due to abort() before checkpointing");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# Now inject a crash before inserting the WAL record for the data checksum
+# state change.  When the server comes back up again the state should not have
+# been set to the new value since the process didn't succeed.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_xlog();');
+enable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# Re-run the same test: with data checksums still disabled after the previous
+# crash, enable them again and crash right before inserting the WAL record.
+# When the server comes back up, checksums must not be enabled.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_xlog();');
+enable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# ---------------------------------------------------------------------------
+# Timing and retry related tests
+#
+
+# Force the checksum enabling process to make multiple passes by removing
+# one database from the list in the first pass.  This simulates a CREATE
+# DATABASE during processing.  Doing this via fault injection means the test
+# does not depend on exact timing.
+$node->safe_psql('postgres', 'SELECT dcw_prune_dblist(true);');
+enable_data_checksums($node, wait => 'on');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 4
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Inject a delay in the barrier for enabling checksums
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_inject_delay_barrier();');
+	enable_data_checksums($node, wait => 'on');
+
+	# Fake the existence of a temporary table at the start of processing,
+	# which will force the processing to wait and retry until it
+	# disappears.
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);');
+	enable_data_checksums($node, wait => 'on');
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/006_pgbench_single.pl b/src/test/modules/test_checksums/t/006_pgbench_single.pl
new file mode 100644
index 00000000000..96f3b2cd8a6
--- /dev/null
+++ b/src/test/modules/test_checksums/t/006_pgbench_single.pl
@@ -0,0 +1,268 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# concurrent activity via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+my $node;
+my $node_loglocation = 0;
+
+# The number of full test iterations which will be performed. The exact number
+# of tests performed and the wall time taken are non-deterministic as the test
+# performs a lot of randomized actions, but 10 iterations will be a long test
+# run regardless.
+my $TEST_ITERATIONS = 10;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+my $pgbench = undef;
+
+# determines whether enable_data_checksums/disable_data_checksums forces an
+# immediate checkpoint
+my @flip_modes = ('true', 'false');
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter.
+sub background_rw_pgbench
+{
+	my $port = shift;
+
+	# If a previous pgbench is still running, start by shutting it down.
+	if ($pgbench)
+	{
+		$pgbench->finish;
+	}
+
+	# Randomize the number of pgbench clients a bit (range 1-15)
+	my $clients = 1 + int(rand(15));
+
+	my @cmd = ('pgbench', '-p', $port, '-T', '600', '-c', $clients);
+
+	# Randomize whether we spawn connections or not
+	push(@cmd, '-C') if (cointoss);
+	# Finally add the database name to use
+	push(@cmd, 'postgres');
+
+	$pgbench = IPC::Run::start(
+		\@cmd,
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Invert the state of data checksums in the cluster: if data checksums are on
+# then disable them, and vice versa. Also performs proper validation of the
+# before and after state.
+sub flip_data_checksums
+{
+	# First, make sure the cluster is in the state we expect it to be
+	test_checksum_state($node, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);')
+		  if cointoss();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before enabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums(
+			$node,
+			wait => 'inprogress-on',
+			'fast' => $mode);
+
+		random_sleep();
+
+		# Wait for checksums enabled on the primary
+		wait_for_checksum_state($node, 'on');
+
+		# log LSN right after the primary flips checksums to "on"
+		$result = $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after enabling: " . $result . "\n");
+
+		random_sleep();
+
+		$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before disabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		disable_data_checksums($node, 'fast' => $mode);
+
+		# Wait for checksums disabled on the primary
+		wait_for_checksum_state($node, 'off');
+
+		# log LSN right after the primary flips checksums to "off"
+		$result = $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after disabling: " . $result . "\n");
+
+		random_sleep();
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since that might otherwise slip by unnoticed, make
+		# sure it aborts the test run.
+		BAIL_OUT('data_checksum_state variable has invalid state: '
+			  . $data_checksum_state);
+	}
+}
+
+# Create and start a cluster with one node
+$node = PostgreSQL::Test::Cluster->new('main');
+$node->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be high enough to accommodate the pgbench clients,
+# and log_statement is dialled down since it would otherwise generate enormous
+# amounts of logging. Page verification failures are still logged.
+$node->append_conf(
+	'postgresql.conf',
+	qq[
+max_connections = 100
+log_statement = none
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+# Initialize pgbench
+$node->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+# Start the test suite with pgbench running.
+background_rw_pgbench($node->port);
+
+# Main test suite. This loop will start a pgbench run on the cluster and while
+# that's running flip the state of data checksums concurrently. It will then
+# randomly restart the cluster (in fast or immediate mode) and then check for
+# the desired state.  The idea behind doing things randomly is to stress out
+# any timing related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced to
+# its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	note("iteration ", ($i + 1), " of ", $TEST_ITERATIONS);
+
+	if (!$node->is_alive)
+	{
+		random_sleep();
+
+		# Start, to do recovery, and stop
+		$node->start;
+		$node->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile,
+			$node_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (during WAL recovery)"
+		);
+		$node_loglocation = -s $node->logfile;
+
+		# Randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(1024));
+		$node->append_conf('postgresql.conf', qq[max_wal_size = $sb]);
+
+		$node->start;
+
+		# Start a pgbench in the background against the primary
+		background_rw_pgbench($node->port);
+	}
+
+	$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result =
+	  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+
+	random_sleep();
+
+	# Potentially powercycle the node
+	if (cointoss())
+	{
+		$node->stop(stopmode());
+
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node->data_dir);
+
+		my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile,
+			$node_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (outside WAL recovery)"
+		);
+		$node_loglocation = -s $node->logfile;
+	}
+
+	random_sleep();
+}
+
+# Make sure the node is running
+if (!$node->is_alive)
+{
+	$node->start;
+}
+
+# The test run is over; ensure that data reads back as expected and perform a
+# final verification of the data checksum state.
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node, $data_checksum_state);
+
+# Perform one final pass over the logs and hunt for unexpected errors
+my $log =
+  PostgreSQL::Test::Utils::slurp_file($node->logfile, $node_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in primary log");
+$node_loglocation = -s $node->logfile;
+
+$node->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/007_pgbench_standby.pl b/src/test/modules/test_checksums/t/007_pgbench_standby.pl
new file mode 100644
index 00000000000..8b8e031cbf6
--- /dev/null
+++ b/src/test/modules/test_checksums/t/007_pgbench_standby.pl
@@ -0,0 +1,398 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster,
+# consisting of a primary and a replicated standby, with concurrent activity
+# via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+my $node_primary_slot = 'physical_slot';
+my $node_primary_backup = 'primary_backup';
+my $node_primary;
+my $node_primary_loglocation = 0;
+my $node_standby_1;
+my $node_standby_1_loglocation = 0;
+
+# The number of full test iterations which will be performed. The exact number
+# of tests performed and the wall time taken are non-deterministic as the test
+# performs a lot of randomized actions, but 5 iterations will be a long test
+# run regardless.
+my $TEST_ITERATIONS = 5;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+
+my $pgbench_primary = undef;
+my $pgbench_standby = undef;
+
+# Variables holding state for managing the cluster and aux processes in
+# various ways
+my ($pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr) =
+  ('', '', '');
+my ($pgb_standby_1_stdin, $pgb_standby_1_stdout, $pgb_standby_1_stderr) =
+  ('', '', '');
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# determines whether enable_data_checksums/disable_data_checksums forces an
+# immediate checkpoint
+my @flip_modes = ('true', 'false');
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter
+sub background_pgbench
+{
+	my ($port, $standby) = @_;
+
+	# Terminate any pgbench process currently running against this node
+	# before continuing
+	my $handle = $standby ? $pgbench_standby : $pgbench_primary;
+	$handle->finish if $handle;
+
+	my $clients = 1 + int(rand(15));
+
+	my @cmd = ('pgbench', '-p', $port, '-T', '600', '-c', $clients);
+	# Randomize whether we spawn connections or not
+	push(@cmd, '-C') if (cointoss());
+	# If we run on a standby it needs to be a read-only benchmark
+	push(@cmd, '-S') if ($standby);
+	# Finally add the database name to use
+	push(@cmd, 'postgres');
+
+	$handle = IPC::Run::start(
+		\@cmd,
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+
+	if ($standby)
+	{
+		$pgbench_standby = $handle;
+	}
+	else
+	{
+		$pgbench_primary = $handle;
+	}
+}
+
+# Invert the state of data checksums in the cluster: if data checksums are on
+# then disable them, and vice versa. Also performs proper validation of the
+# before and after state.
+sub flip_data_checksums
+{
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(true);')
+		  if cointoss();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before enabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums(
+			$node_primary,
+			wait => 'inprogress-on',
+			'fast' => $mode);
+		random_sleep();
+		# Wait for checksum enable to be replayed
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Ensure that the standby has switched to "inprogress-on" or "on".
+		# Normally it would be "inprogress-on", but it is theoretically
+		# possible for the primary to complete the checksum enabling *and* have
+		# the standby replay that record before we reach the check below.
+		$result = $node_standby_1->poll_query_until(
+			'postgres',
+			"SELECT setting = 'off' "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';",
+			'f');
+		is($result, 1,
+			'ensure standby has absorbed the inprogress-on barrier');
+		random_sleep();
+		$result = $node_standby_1->safe_psql('postgres',
+				"SELECT setting "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';");
+
+		is(($result eq 'inprogress-on' || $result eq 'on'),
+			1, 'ensure checksums are on, or in progress, on standby_1');
+
+		# Wait for checksums enabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'on');
+
+		# log LSN right after the primary flips checksums to "on"
+		$result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after enabling: " . $result . "\n");
+
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'on');
+
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before disabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		disable_data_checksums($node_primary, 'fast' => $mode);
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Wait for checksums disabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'off');
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		# log LSN right after the primary flips checksums to "off"
+		$result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after disabling: " . $result . "\n");
+
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since that might otherwise slip by unnoticed, make
+		# sure it gets caught with a test failure.
+		fail('data_checksum_state variable has invalid state');
+	}
+}
+
+# Create and start a cluster with one primary and one standby node, and ensure
+# they are caught up and in sync.
+$node_primary = PostgreSQL::Test::Cluster->new('main');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be high enough to accommodate the pgbench clients,
+# and log_statement is dialled down since it would otherwise generate enormous
+# amounts of logging. Page verification failures are still logged.
+$node_primary->append_conf(
+	'postgresql.conf',
+	qq[
+max_connections = 30
+log_statement = none
+]);
+$node_primary->start;
+$node_primary->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$node_primary_slot');");
+$node_primary->backup($node_primary_backup);
+
+$node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $node_primary_backup,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$node_primary_slot'
+]);
+$node_standby_1->start;
+
+# Initialize pgbench and wait for the objects to be created on the standby
+$node_primary->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Start the test suite with pgbench running on all nodes
+background_pgbench($node_standby_1->port, 1);
+background_pgbench($node_primary->port, 0);
+
+# Main test suite. This loop will start a pgbench run on the cluster and while
+# that's running flip the state of data checksums concurrently. It will then
+# randomly restart the cluster (in fast or immediate mode) and then check for
+# the desired state.  The idea behind doing things randomly is to stress out
+# any timing related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced to
+# its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	note("iteration ", ($i + 1), " of ", $TEST_ITERATIONS);
+
+	if (!$node_primary->is_alive)
+	{
+		random_sleep();
+
+		# start, to do recovery, and stop
+		$node_primary->start;
+		$node_primary->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+			$node_primary_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (during WAL recovery)"
+		);
+		$node_primary_loglocation = -s $node_primary->logfile;
+
+		# randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(960));
+		$node_primary->append_conf('postgresql.conf', qq[max_wal_size = $sb]);
+
+		note("changing primary max_wal_size to " . $sb);
+
+		$node_primary->start;
+
+		# Start a pgbench in the background against the primary
+		background_pgbench($node_primary->port, 0);
+	}
+
+	if (!$node_standby_1->is_alive)
+	{
+		random_sleep();
+
+		$node_standby_1->start;
+		$node_standby_1->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log =
+		  PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+			$node_standby_1_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in standby_1 log (during WAL recovery)"
+		);
+		$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+		# randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(960));
+		$node_standby_1->append_conf('postgresql.conf',
+			qq[max_wal_size = $sb]);
+
+		note("changing standby max_wal_size to " . $sb);
+
+		$node_standby_1->start;
+
+		# Start a select-only pgbench in the background on the standby
+		background_pgbench($node_standby_1->port, 1);
+	}
+
+	$node_primary->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result = $node_primary->safe_psql('postgres',
+		"SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+	random_sleep();
+	$node_primary->wait_for_catchup($node_standby_1, 'write');
+
+	random_sleep();
+
+	# Potentially powercycle the cluster (the nodes independently)
+	# XXX should maybe try stopping nodes in the opposite order too?
+	if (cointoss())
+	{
+		$node_primary->stop(stopmode());
+
+		# print the contents of the control file on the primary
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node_primary->data_dir);
+
+		# slurp the file after shutdown, so that it doesn't interfere with the recovery
+		my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+			$node_primary_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (outside WAL recovery)"
+		);
+		$node_primary_loglocation = -s $node_primary->logfile;
+	}
+
+	random_sleep();
+
+	if (cointoss())
+	{
+		$node_standby_1->stop(stopmode());
+
+		# print the contents of the control file on the standby
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node_standby_1->data_dir);
+
+		# slurp the file after shutdown, so that it doesn't interfere with the recovery
+		my $log =
+		  PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+			$node_standby_1_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in standby_1 log (outside WAL recovery)"
+		);
+		$node_standby_1_loglocation = -s $node_standby_1->logfile;
+	}
+}
+
+# make sure the nodes are running
+if (!$node_primary->is_alive)
+{
+	$node_primary->start;
+}
+
+if (!$node_standby_1->is_alive)
+{
+	$node_standby_1->start;
+}
+
+# The test run is over; ensure that data reads back as expected and perform a
+# final verification of the data checksum state.
+my $result =
+  $node_primary->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node_primary, $data_checksum_state);
+test_checksum_state($node_standby_1, $data_checksum_state);
+
+# Perform one final pass over the logs and hunt for unexpected errors
+my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+	$node_primary_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in primary log");
+$node_primary_loglocation = -s $node_primary->logfile;
+$log = PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+	$node_standby_1_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in standby_1 log");
+$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+$node_standby_1->teardown_node;
+$node_primary->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/DataChecksums/Utils.pm b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
new file mode 100644
index 00000000000..cf670be944c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
@@ -0,0 +1,283 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+DataChecksums::Utils - Utility functions for testing data checksums in a running cluster
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::Cluster;
+  use DataChecksums::Utils qw( .. );
+
+  # Create, and start, a new cluster
+  my $node = PostgreSQL::Test::Cluster->new('primary');
+  $node->init;
+  $node->start;
+
+  test_checksum_state($node, 'off');
+
+  enable_data_checksums($node);
+
+  wait_for_checksum_state($node, 'on');
+
+
+=cut
+
+package DataChecksums::Utils;
+
+use strict;
+use warnings FATAL => 'all';
+use Exporter 'import';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+our @EXPORT = qw(
+  cointoss
+  disable_data_checksums
+  enable_data_checksums
+  random_sleep
+  stopmode
+  test_checksum_state
+  wait_for_checksum_state
+  wait_for_cluster_crash
+);
+
+=pod
+
+=head1 METHODS
+
+=over
+
+=item test_checksum_state(node, state)
+
+Test that the current value of the data checksum GUC in the server running
+at B<node> matches B<state>.  If the values differ, a test failure is logged.
+Returns True if the values match, otherwise False.
+
+=cut
+
+sub test_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $result = $postgresnode->safe_psql('postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+	);
+	is($result, $state, 'ensure checksums are set to ' . $state);
+	return $result eq $state;
+}
+
+=item wait_for_checksum_state(node, state)
+
+Test the value of the data checksum GUC in the server running at B<node>
+repeatedly until it matches B<state> or times out.  Processing will run for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.  If the
+values differ when the process times out, False is returned and a test failure
+is logged, otherwise True.
+
+=cut
+
+sub wait_for_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $res = $postgresnode->poll_query_until(
+		'postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+		$state);
+	is($res, 1, 'ensure data checksums are transitioned to ' . $state);
+	return $res == 1;
+}
+
+=item wait_for_cluster_crash(node, params)
+
+Repeatedly test whether the cluster running at B<node> responds to connections
+and return when it no longer does so, or when it times out.  Processing will
+run for $PostgreSQL::Test::Utils::timeout_default seconds unless a timeout
+value is specified as a parameter.  Returns True if the cluster crashed, else
+False if the process timed out.
+
+=over
+
+=item timeout
+
+Approximate number of seconds to wait for the cluster to crash, default is
+$PostgreSQL::Test::Utils::timeout_default.  There is no real-time guarantee
+that the total processing time won't exceed the timeout.
+
+=back
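+
+For example, a call giving the node up to ten seconds to go down could look
+like:
+
+  wait_for_cluster_crash($node, timeout => 10);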
+
+=cut
+
+sub wait_for_cluster_crash
+{
+	my $postgresnode = shift;
+	my %params = @_;
+	my $crash = 0;
+
+	$params{timeout} = $PostgreSQL::Test::Utils::timeout_default
+	  unless (defined($params{timeout}));
+
+	for (my $naps = 0; $naps < $params{timeout}; $naps++)
+	{
+		if (!$postgresnode->is_alive)
+		{
+			$crash = 1;
+			last;
+		}
+		sleep(1);
+	}
+
+	return $crash == 1;
+}
+
+=item enable_data_checksums($node, %params)
+
+Function for enabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item cost_delay
+
+The B<cost_delay> to use when enabling data checksums, default is 0.
+
+=item cost_limit
+
+The B<cost_limit> to use when enabling data checksums, default is 100.
+
+=item fast
+
+If set to C<true> an immediate checkpoint will be issued after data
+checksums are enabled.  Setting this to false will lead to slower tests.
+The default is C<true>.
+
+=item wait
+
+If defined, the function will wait for the state given in this parameter, or
+for the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+
+=back
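+
+A minimal invocation which lightly throttles the worker and blocks until the
+cluster reports data checksums as enabled could, for example, look like:
+
+  enable_data_checksums($node, cost_delay => 10, wait => 'on');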
+
+=cut
+
+sub enable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{cost_delay} = 0 unless (defined($params{cost_delay}));
+	$params{cost_limit} = 100 unless (defined($params{cost_limit}));
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_enable_data_checksums(%s, %s, %s);
+EOQ
+
+	$postgresnode->safe_psql(
+		'postgres',
+		sprintf($query,
+			$params{cost_delay}, $params{cost_limit}, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, $params{wait})
+	  if (defined($params{wait}));
+}
+
+=item disable_data_checksums($node, %params)
+
+Function for disabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item wait
+
+If defined, the function will wait for the state to turn to B<off>, or for
+the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+Unlike in C<enable_data_checksums>, the value of the parameter is ignored.
+
+=item fast
+
+If set to C<true>, the checkpoint issued after disabling data checksums will
+be immediate, else it will be a regular, spread checkpoint.  The default is
+C<true>.
+
+=back
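+
+A call which disables data checksums using a spread (non-immediate) checkpoint
+and waits for the cluster to report the state as off could, for example, look
+like:
+
+  disable_data_checksums($node, fast => 'false', wait => 1);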
+
+=cut
+
+sub disable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_disable_data_checksums(%s);
+EOQ
+
+	$postgresnode->safe_psql('postgres', sprintf($query, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, 'off') if (defined($params{wait}));
+}
+
+=item cointoss
+
+Helper returning a random boolean value, used for deciding whether to perform
+optional actions such as sleeps and node restarts during the test run.
+
+=cut
+
+sub cointoss
+{
+	return int(rand() < 0.5);
+}
+
+=item random_sleep(max)
+
+Helper for injecting random sleeps here and there in the test run.  The sleep
+is only performed roughly half of the time, and the duration is random in the
+range (0, B<max>) seconds, in order to avoid predictable sleep patterns which
+could mask race conditions and timing bugs.  The default B<max> is 3 seconds.
+
+=cut
+
+sub random_sleep
+{
+	my $max = shift;
+	sleep(int(rand(defined($max) ? $max : 3))) if cointoss;
+}
+
+=item stopmode
+
+Small helper function for randomly selecting a valid stop mode, either
+C<fast> or C<immediate>.
+
+=cut
+
+sub stopmode
+{
+	return 'immediate' if (cointoss);
+	return 'fast';
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/modules/test_checksums/test_checksums--1.0.sql b/src/test/modules/test_checksums/test_checksums--1.0.sql
new file mode 100644
index 00000000000..aa086d5c430
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums--1.0.sql
@@ -0,0 +1,28 @@
+/* src/test/modules/test_checksums/test_checksums--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_checksums" to load this file. \quit
+
+CREATE FUNCTION dcw_inject_delay_barrier(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_inject_fail_database(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_prune_dblist(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_fake_temptable(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dc_crash_before_checkpoint(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dc_crash_before_xlog(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_checksums/test_checksums.c b/src/test/modules/test_checksums/test_checksums.c
new file mode 100644
index 00000000000..c182f2c868b
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.c
@@ -0,0 +1,225 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_checksums.c
+ *		Test data checksums
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_checksums/test_checksums.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/latch.h"
+#include "utils/injection_point.h"
+#include "utils/wait_event.h"
+
+#define USEC_PER_SEC    1000000
+
+PG_MODULE_MAGIC;
+
+extern PGDLLEXPORT void dc_delay_barrier(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fail_database(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_dblist(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fake_temptable(const char *name, const void *private_data, void *arg);
+
+extern PGDLLEXPORT void crash(const char *name, const void *private_data, void *arg);
+
+/*
+ * Test for delaying emission of procsignalbarriers.
+ */
+void
+dc_delay_barrier(const char *name, const void *private_data, void *arg)
+{
+	(void) name;
+	(void) private_data;
+
+	(void) WaitLatch(MyLatch,
+					 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+					 (3 * 1000),
+					 WAIT_EVENT_PG_SLEEP);
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_delay_barrier);
+Datum
+dcw_inject_delay_barrier(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-enable-checksums-delay",
+							 "test_checksums",
+							 "dc_delay_barrier",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksums-enable-checksums-delay");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+void
+dc_fail_database(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	DataChecksumsWorkerResult *res = (DataChecksumsWorkerResult *) arg;
+
+	if (first_pass)
+		*res = DATACHECKSUMSWORKER_FAILED;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_fail_database);
+Datum
+dcw_inject_fail_database(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fail-db",
+							 "test_checksums",
+							 "dc_fail_database",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fail-db");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test which removes an entry from the database list so that not all
+ * databases are processed in the first iteration of the loop, forcing
+ * re-processing.
+ */
+void
+dc_dblist(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	List	   *DatabaseList = (List *) arg;
+
+	if (first_pass)
+		DatabaseList = list_delete_last(DatabaseList);
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_prune_dblist);
+Datum
+dcw_prune_dblist(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-initial-dblist",
+							 "test_checksums",
+							 "dc_dblist",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-initial-dblist");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to force waiting for existing temptables.
+ */
+void
+dc_fake_temptable(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	int		   *numleft = (int *) arg;
+
+	if (first_pass)
+		*numleft = 1;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_fake_temptable);
+Datum
+dcw_fake_temptable(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fake-temptable-wait",
+							 "test_checksums",
+							 "dc_fake_temptable",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fake-temptable-wait");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+void
+crash(const char *name, const void *private_data, void *arg)
+{
+	abort();
+}
+
+/*
+ * dc_crash_before_checkpoint
+ *
+ * Ensure that the server crashes just before the checkpoint is issued after
+ * enabling or disabling checksums.
+ */
+PG_FUNCTION_INFO_V1(dc_crash_before_checkpoint);
+Datum
+dc_crash_before_checkpoint(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	InjectionPointAttach("datachecksums-enable-checksums-pre-checkpoint",
+						 "test_checksums", "crash", NULL, 0);
+	InjectionPointAttach("datachecksums-disable-checksums-pre-checkpoint",
+						 "test_checksums", "crash", NULL, 0);
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * dc_crash_before_xlog
+ *
+ * Ensure that the server crashes right before it is about to insert the xlog
+ * record XLOG_CHECKSUMS.
+ */
+PG_FUNCTION_INFO_V1(dc_crash_before_xlog);
+Datum
+dc_crash_before_xlog(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-xlogchecksums-pre-xloginsert",
+							 "test_checksums", "crash", NULL, 0);
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_checksums/test_checksums.control b/src/test/modules/test_checksums/test_checksums.control
new file mode 100644
index 00000000000..84b4cc035a7
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.control
@@ -0,0 +1,4 @@
+comment = 'Test code for data checksums'
+default_version = '1.0'
+module_pathname = '$libdir/test_checksums'
+relocatable = true
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 35413f14019..3af7944acea 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3872,6 +3872,51 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
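+A typical sequence, taking the node offline before changing the checksum state
+and starting it again afterwards, could for example look like:
+
+  $node->stop;
+  $node->checksum_enable_offline;
+  $node->start;
+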
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "# Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "# Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
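+=item $node->checksum_verify_offline()
+
+Verify data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+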
+sub checksum_verify_offline
+{
+	my ($self) = @_;
+
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-c');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7f1cb3bb4af..432e6e596e2 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2073,6 +2073,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 67e1860e984..c9feff8331e 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksum launcher|relation|bulkread
+datachecksum launcher|relation|bulkwrite
+datachecksum launcher|relation|init
+datachecksum launcher|relation|normal
+datachecksum launcher|relation|vacuum
+datachecksum launcher|temp relation|normal
+datachecksum launcher|wal|init
+datachecksum launcher|wal|normal
+datachecksum worker|relation|bulkread
+datachecksum worker|relation|bulkwrite
+datachecksum worker|relation|init
+datachecksum worker|relation|normal
+datachecksum worker|relation|vacuum
+datachecksum worker|temp relation|normal
+datachecksum worker|wal|init
+datachecksum worker|wal|normal
 io worker|relation|bulkread
 io worker|relation|bulkwrite
 io worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(95 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37f26f6c6b7..90451ab09c0 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -416,6 +416,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -608,6 +609,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4241,6 +4246,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

v20251006-0002-Log-checksum-version-during-checkpoints-et.patch (application/octet-stream)
From fac6dfc4cdfef3302068420389e8aeeafc508014 Mon Sep 17 00:00:00 2001
From: tomas <tomas>
Date: Sat, 30 Aug 2025 15:57:21 +0200
Subject: [PATCH v20251006 2/2] Log checksum version during checkpoints etc.

log data_checksum_version, when:
- reading/writing the control file
- on every checkpoint
- setting ControlFile->data_checksum_version
---
 src/backend/access/transam/xlog.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 59f5cbe839f..1dbffd6097d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4445,6 +4445,9 @@ WriteControlFile(void)
 				(errcode_for_file_access(),
 				 errmsg("could not close file \"%s\": %m",
 						XLOG_CONTROL_FILE)));
+
+	elog(LOG, "WriteControlFile ControlFile->data_checksum_version = %d ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
 }
 
 static void
@@ -4663,6 +4666,9 @@ ReadControlFile(void)
 	elog(LOG, "ReadControlFile checkpoint %X/%08X redo %X/%08X",
 		 LSN_FORMAT_ARGS(ControlFile->checkPoint),
 		 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo));
+
+	elog(LOG, "ReadControlFile ControlFile->data_checksum_version = %d ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
 }
 
 /*
@@ -5567,6 +5573,9 @@ XLOGShmemInit(void)
 	/* Use the checksum info from control file */
 	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
 
+	elog(LOG, "XLOGShmemInit ControlFile->data_checksum_version = %d  ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
 
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
@@ -7336,7 +7345,7 @@ LogCheckpointEnd(bool restartpoint)
 						"%d removed, %d recycled; write=%ld.%03d s, "
 						"sync=%ld.%03d s, total=%ld.%03d s; sync files=%d, "
 						"longest=%ld.%03d s, average=%ld.%03d s; distance=%d kB, "
-						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
+						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X, checksums=%d (%d)",
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_slru_written,
@@ -7352,7 +7361,9 @@ LogCheckpointEnd(bool restartpoint)
 						(int) (PrevCheckPointDistance / 1024.0),
 						(int) (CheckPointDistanceEstimate / 1024.0),
 						LSN_FORMAT_ARGS(ControlFile->checkPoint),
-						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo))));
+						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
+						ControlFile->data_checksum_version,
+						ControlFile->checkPointCopy.data_checksum_version)));
 	else
 		ereport(LOG,
 				(errmsg("checkpoint complete: wrote %d buffers (%.1f%%), "
@@ -7360,7 +7371,7 @@ LogCheckpointEnd(bool restartpoint)
 						"%d removed, %d recycled; write=%ld.%03d s, "
 						"sync=%ld.%03d s, total=%ld.%03d s; sync files=%d, "
 						"longest=%ld.%03d s, average=%ld.%03d s; distance=%d kB, "
-						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
+						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X, checksums=%d (%d)",
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_slru_written,
@@ -7376,7 +7387,9 @@ LogCheckpointEnd(bool restartpoint)
 						(int) (PrevCheckPointDistance / 1024.0),
 						(int) (CheckPointDistanceEstimate / 1024.0),
 						LSN_FORMAT_ARGS(ControlFile->checkPoint),
-						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo))));
+						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
+						ControlFile->data_checksum_version,
+						ControlFile->checkPointCopy.data_checksum_version)));
 }
 
 /*
@@ -7870,6 +7883,9 @@ CreateCheckPoint(int flags)
 	/* make sure we start with the checksum version as of the checkpoint */
 	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
 
+	elog(LOG, "CreateCheckPoint ControlFile->data_checksum_version = %d ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -8017,6 +8033,9 @@ CreateEndOfRecoveryRecord(void)
 	/* start with the latest checksum version (as of the end of recovery) */
 	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
 
+	elog(LOG, "CreateEndOfRecoveryRecord ControlFile->data_checksum_version = %d  ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -8362,6 +8381,9 @@ CreateRestartPoint(int flags)
 		/* we shall start with the latest checksum version */
 		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
 
+		elog(LOG, "CreateRestartPoint ControlFile->data_checksum_version = %d  ControlFile->checkPointCopy.data_checksum_version = %d",
+			 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
-- 
2.39.3 (Apple Git-146)

#63Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#62)
2 attachment(s)
Re: Changing the state of data checksums in a running cluster

Rebase due to recent conflicts with only a trivial whitespace fix, otherwise
the same as the previous version.

--
Daniel Gustafsson

Attachments:

v20251105-0002-Log-checksum-version-during-checkpoints-et.patch (application/octet-stream)
From e7d573bf8d3b832f196caf99765cb069956eb80c Mon Sep 17 00:00:00 2001
From: tomas <tomas>
Date: Sat, 30 Aug 2025 15:57:21 +0200
Subject: [PATCH v20251105 2/2] Log checksum version during checkpoints etc.

log data_checksum_version, when:
- reading/writing the control file
- on every checkpoint
- setting ControlFile->data_checksum_version
---
 src/backend/access/transam/xlog.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d70d0493dcb..1893396af85 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4447,6 +4447,9 @@ WriteControlFile(void)
 				(errcode_for_file_access(),
 				 errmsg("could not close file \"%s\": %m",
 						XLOG_CONTROL_FILE)));
+
+	elog(LOG, "WriteControlFile ControlFile->data_checksum_version = %d ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
 }
 
 static void
@@ -4665,6 +4668,9 @@ ReadControlFile(void)
 	elog(LOG, "ReadControlFile checkpoint %X/%08X redo %X/%08X",
 		 LSN_FORMAT_ARGS(ControlFile->checkPoint),
 		 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo));
+
+	elog(LOG, "ReadControlFile ControlFile->data_checksum_version = %d ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
 }
 
 /*
@@ -5569,6 +5575,9 @@ XLOGShmemInit(void)
 	/* Use the checksum info from control file */
 	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
 
+	elog(LOG, "XLOGShmemInit ControlFile->data_checksum_version = %d  ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
 
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
@@ -7338,7 +7347,7 @@ LogCheckpointEnd(bool restartpoint)
 						"%d removed, %d recycled; write=%ld.%03d s, "
 						"sync=%ld.%03d s, total=%ld.%03d s; sync files=%d, "
 						"longest=%ld.%03d s, average=%ld.%03d s; distance=%d kB, "
-						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
+						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X, checksums=%d (%d)",
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_slru_written,
@@ -7354,7 +7363,9 @@ LogCheckpointEnd(bool restartpoint)
 						(int) (PrevCheckPointDistance / 1024.0),
 						(int) (CheckPointDistanceEstimate / 1024.0),
 						LSN_FORMAT_ARGS(ControlFile->checkPoint),
-						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo))));
+						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
+						ControlFile->data_checksum_version,
+						ControlFile->checkPointCopy.data_checksum_version)));
 	else
 		ereport(LOG,
 				(errmsg("checkpoint complete: wrote %d buffers (%.1f%%), "
@@ -7362,7 +7373,7 @@ LogCheckpointEnd(bool restartpoint)
 						"%d removed, %d recycled; write=%ld.%03d s, "
 						"sync=%ld.%03d s, total=%ld.%03d s; sync files=%d, "
 						"longest=%ld.%03d s, average=%ld.%03d s; distance=%d kB, "
-						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X",
+						"estimate=%d kB; lsn=%X/%08X, redo lsn=%X/%08X, checksums=%d (%d)",
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_slru_written,
@@ -7378,7 +7389,9 @@ LogCheckpointEnd(bool restartpoint)
 						(int) (PrevCheckPointDistance / 1024.0),
 						(int) (CheckPointDistanceEstimate / 1024.0),
 						LSN_FORMAT_ARGS(ControlFile->checkPoint),
-						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo))));
+						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
+						ControlFile->data_checksum_version,
+						ControlFile->checkPointCopy.data_checksum_version)));
 }
 
 /*
@@ -7872,6 +7885,9 @@ CreateCheckPoint(int flags)
 	/* make sure we start with the checksum version as of the checkpoint */
 	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
 
+	elog(LOG, "CreateCheckPoint ControlFile->data_checksum_version = %d ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -8019,6 +8035,9 @@ CreateEndOfRecoveryRecord(void)
 	/* start with the latest checksum version (as of the end of recovery) */
 	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
 
+	elog(LOG, "CreateEndOfRecoveryRecord ControlFile->data_checksum_version = %d  ControlFile->checkPointCopy.data_checksum_version = %d",
+		 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -8364,6 +8383,9 @@ CreateRestartPoint(int flags)
 		/* we shall start with the latest checksum version */
 		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
 
+		elog(LOG, "CreateRestartPoint ControlFile->data_checksum_version = %d  ControlFile->checkPointCopy.data_checksum_version = %d",
+			 ControlFile->data_checksum_version, ControlFile->checkPointCopy.data_checksum_version);
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
-- 
2.39.3 (Apple Git-146)

v20251105-0001-Online-enabling-and-disabling-of-data-chec.patch (application/octet-stream)
From 9053f508374208b3fe28aa372e5c2ec1d2ad7079 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 15 Aug 2025 14:48:02 +0200
Subject: [PATCH v20251105 1/2] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could prior to this only be enabled during initdb or
when the cluster is offline using the pg_checksums application. This
commit introduces functionality to enable, or disable, data checksums
while the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

A new test module, test_checksums, is introduced with an extensive
set of tests covering both online and offline data checksum mode
changes.  The tests for online processing are to some degree gated
behind the PG_TEST_EXTRA flag due to being very time consuming to
run.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  During
the work on this new version, Tomas Vondra has given invaluable
assistance with not only coding and reviewing but also very in-depth
testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Co-authored-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func/func-admin.sgml             |   71 +
 doc/src/sgml/glossary.sgml                    |   23 +
 doc/src/sgml/monitoring.sgml                  |  208 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/regress.sgml                     |   12 +
 doc/src/sgml/wal.sgml                         |   59 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  678 +++++++-
 src/backend/access/transam/xlogfuncs.c        |   57 +
 src/backend/access/transam/xlogrecovery.c     |   13 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   20 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/auxprocess.c           |   19 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1471 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    4 +
 src/backend/storage/ipc/procsignal.c          |   14 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |   10 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    4 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    3 +-
 src/backend/utils/init/postinit.c             |   20 +-
 src/backend/utils/misc/guc_parameters.dat     |    5 +-
 src/backend/utils/misc/guc_tables.c           |    9 +-
 src/bin/pg_checksums/pg_checksums.c           |    4 +-
 src/bin/pg_controldata/pg_controldata.c       |    2 +
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   14 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   19 +
 src/include/commands/progress.h               |   17 +
 src/include/miscadmin.h                       |    6 +
 src/include/postmaster/datachecksumsworker.h  |   51 +
 src/include/postmaster/proctypelist.h         |    2 +
 src/include/storage/bufpage.h                 |    2 +-
 src/include/storage/checksum.h                |   15 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/modules/Makefile                     |    1 +
 src/test/modules/meson.build                  |    1 +
 src/test/modules/test_checksums/.gitignore    |    2 +
 src/test/modules/test_checksums/Makefile      |   40 +
 src/test/modules/test_checksums/README        |   22 +
 src/test/modules/test_checksums/meson.build   |   36 +
 .../modules/test_checksums/t/001_basic.pl     |   63 +
 .../modules/test_checksums/t/002_restarts.pl  |  110 ++
 .../test_checksums/t/003_standby_restarts.pl  |  114 ++
 .../modules/test_checksums/t/004_offline.pl   |   82 +
 .../modules/test_checksums/t/005_injection.pl |  126 ++
 .../test_checksums/t/006_pgbench_single.pl    |  268 +++
 .../test_checksums/t/007_pgbench_standby.pl   |  398 +++++
 .../test_checksums/t/DataChecksums/Utils.pm   |  283 ++++
 .../test_checksums/test_checksums--1.0.sql    |   28 +
 .../modules/test_checksums/test_checksums.c   |  225 +++
 .../test_checksums/test_checksums.control     |    4 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   45 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    6 +
 70 files changed, 4815 insertions(+), 47 deletions(-)
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/modules/test_checksums/.gitignore
 create mode 100644 src/test/modules/test_checksums/Makefile
 create mode 100644 src/test/modules/test_checksums/README
 create mode 100644 src/test/modules/test_checksums/meson.build
 create mode 100644 src/test/modules/test_checksums/t/001_basic.pl
 create mode 100644 src/test/modules/test_checksums/t/002_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/003_standby_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/004_offline.pl
 create mode 100644 src/test/modules/test_checksums/t/005_injection.pl
 create mode 100644 src/test/modules/test_checksums/t/006_pgbench_single.pl
 create mode 100644 src/test/modules/test_checksums/t/007_pgbench_standby.pl
 create mode 100644 src/test/modules/test_checksums/t/DataChecksums/Utils.pm
 create mode 100644 src/test/modules/test_checksums/test_checksums--1.0.sql
 create mode 100644 src/test/modules/test_checksums/test_checksums.c
 create mode 100644 src/test/modules/test_checksums/test_checksums.control

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 1b465bc8ba7..f3a8782ede0 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -2979,4 +2979,75 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+    See <xref linkend="checksums" /> for details.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates enabling of data checksums for the cluster. This will switch
+        the data checksum mode to <literal>inprogress-on</literal> as well as
+        start a background worker that will process all pages in every
+        database and enable checksums on them. When all data pages have had
+        checksums enabled, the cluster will automatically switch the data
+        checksum mode to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the speed of the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ()
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum validation and calculation for the cluster. This
+        will switch the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled. When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        changed to <literal>off</literal>.  At this point the data pages will
+        still have checksums recorded but they are not updated when pages are
+        modified.
+       </para></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 8651f0cdb91..9bac0c96348 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -184,6 +184,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -573,6 +575,27 @@
    </glossdef>
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables or disables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts one <glossterm linkend="glossary-data-checksums-worker">data
+     checksums worker</glossterm> process for each database.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-db-cluster">
    <glossterm>Database cluster</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index f3bf527d5b4..b56e220f3d8 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3551,8 +3551,9 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
+       database (or on a shared object).
+       Detected failures are reported regardless of the
+       <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -3562,8 +3563,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6946,6 +6947,205 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating checksums for the data pages in one database.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of the data checksums launcher or worker process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of this database, or 0 for the launcher process
+       OID of the database being processed, or 0 for the launcher process.
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>datname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this database, or <literal>NULL</literal> for the
+       launcher process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set; the worker processes have this
+        set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the launcher
+        process has this value set; the worker processes have this set to
+        <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of relations yet. The launcher process always
+        has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process always has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the data checksums worker process hasn't
+        calculated the number of blocks yet. The launcher process always has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process always has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index e9e393495df..e764b8be04d 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were being enabled online when the
+   cluster was shut down, <application>pg_checksums</application> will still
+   process all relations, regardless of any work done by the online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 8838fe7f022..7074751834e 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -263,6 +263,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
    The following values are currently supported:
    <variablelist>
+    <varlistentry>
+     <term><literal>checksum_extended</literal></term>
+     <listitem>
+      <para>
+       Runs additional tests for enabling data checksums which inject delays
+       and retries into the processing, as well as tests that run pgbench
+       concurrently and randomly restart the cluster.  Some of these test
+       suites require injection points to be enabled in the installation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><literal>kerberos</literal></term>
      <listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..0ada90ca0b1 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster, allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,56 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Enabling checksums will put the cluster checksum mode into
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition, a background worker process is
+    started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to be removed before it finishes.
+    If long-lived temporary tables are used in the application, it may be necessary
+    to terminate the connections holding them to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode, for
+    any reason, then this process must be restarted manually. To do this,
+    re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over; there is
+    no support for resuming work from where it was interrupted.
+   </para>
+
+   <note>
+    <para>
+     Enabling checksums can place a significant I/O load on the system, as most
+     of the database pages will need to be rewritten, and will be written to both
+     the data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+   </note>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9900e3e0179..d70d0493dcb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -286,6 +286,11 @@ static XLogRecPtr RedoRecPtr;
  */
 static bool doPageWrites;
 
+/*
+ * Force creating a restart point on the next CHECKPOINT after XLOG_CHECKSUMS.
+ */
+static bool checksumRestartPoint = false;
+
 /*----------
  * Shared-memory data structures for XLOG control
  *
@@ -550,6 +555,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -573,6 +581,44 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
  */
 static ControlFileData *ControlFile = NULL;
 
+/*
+ * This must match the largest number of elements in the barrier_eq and
+ * barrier_ne sets in the checksum_barriers definition below.
+ */
+#define MAX_BARRIER_CONDITIONS 2
+
+/*
+ * Configuration of conditions which must match when absorbing a procsignal
+ * barrier during data checksum enable/disable operations.  A single function
+ * is used for absorbing all barriers, and the set of conditions to use is
+ * looked up in the checksum_barriers struct.  The struct member for the target
+ * state defines which state the backend must currently be in, and which it
+ * must not be in.
+ */
+typedef struct ChecksumBarrierCondition
+{
+	/* The target state of the barrier */
+	int			target;
+	/* A set of states in which at least one MUST match the current state */
+	int			barrier_eq[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_eq set */
+	int			barrier_eq_sz;
+	/* A set of states which all MUST NOT match the current state */
+	int			barrier_ne[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_ne set */
+	int			barrier_ne_sz;
+}			ChecksumBarrierCondition;
+
+static const ChecksumBarrierCondition checksum_barriers[] =
+{
+	{PG_DATA_CHECKSUM_OFF, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION}, 2, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_VERSION, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION}, 1, {0}, 0},
+	{PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, {PG_DATA_CHECKSUM_ANY_VERSION}, 1, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, {PG_DATA_CHECKSUM_VERSION}, 1, {0}, 0},
+	{-1}
+};
+
+
 /*
  * Calculate the amount of space left on the page after 'endptr'. Beware
  * multiple evaluation!
@@ -647,6 +693,36 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for the ControlFile data_checksum_version.  After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing.  The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating checksum state.  Possible values are the
+ * checksum versions defined in storage/bufpage.h as well as zero when data
+ * checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value, but for PG_DATA_CHECKSUM_VERSION this would
+ * trigger an assert failure (this is the only transition with an assert) when
+ * processing the barrier.  This may happen if the process is spawned between
+ * the update of XLogCtl->data_checksum_version and the barrier being emitted.
+ * This can only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
+/*
+ * Variable backing the GUC; keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -715,6 +791,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(uint32 new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -829,9 +907,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -844,7 +923,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4251,6 +4332,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from. (Maybe it should go just there?)
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4575,9 +4662,9 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	elog(LOG, "ReadControlFile checkpoint %X/%08X redo %X/%08X",
+		 LSN_FORMAT_ARGS(ControlFile->checkPoint),
+		 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo));
 }
 
 /*
@@ -4611,13 +4698,430 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or are in the process of being
+ * enabled or disabled. During the "inprogress-on" and "inprogress-off" states
+ * checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. The call must be made as close to the write operation as
+ * possible to keep the critical section short.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to decide, based on the return value, whether to perform checksum validation.
+ * The call must be made as close to the validation as possible to keep the
+ * critical section short, in order to protect against time-of-check/time-of-use
+ * situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
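
As an illustration of the contract above (not part of the patch), a minimal
sketch of a read-side callsite could look as follows.  The function name
verify_page_checksum_if_needed is hypothetical; the interrupt hold-off mirrors
the rule spelled out in datachecksumsworker.c that the local state must not be
allowed to change between interrogating it and acting on it:

    static void
    verify_page_checksum_if_needed(Page page, BlockNumber blkno)
    {
        /*
         * Hold off interrupts so that a procsignal barrier cannot flip the
         * local checksum state between the check and the validation.
         */
        HOLD_INTERRUPTS();
        if (!PageIsNew(page) && DataChecksumsNeedVerify())
        {
            uint16      checksum = pg_checksum_page((char *) page, blkno);

            if (checksum != ((PageHeader) page)->pd_checksum)
                ereport(WARNING,
                        (errmsg("page verification failed, calculated checksum %u but expected %u",
                                checksum, ((PageHeader) page)->pd_checksum)));
        }
        RESUME_INTERRUPTS();
    }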
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages, it can thus be used to check for
+ * aborted checksum processing which need to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description of how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(bool immediate_checkpoint)
+{
+	uint64		barrier;
+	int			flags;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	/*
+	 * force checkpoint to persist the current checksum state in control file
+	 * etc.
+	 *
+	 * XXX is this needed? There's already a checkpoint at the end of
+	 * ProcessAllDatabases, maybe this is redundant?
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(bool immediate_checkpoint)
 {
+	uint64		barrier;
+	int			flags;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	INJECTION_POINT("datachecksums-enable-checksums-pre-checkpoint", NULL);
+
+	/* XXX is this needed? */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
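
Taken together with SetDataChecksumsOnInProgress() above, the enable path can
be summarized with the sketch below.  This is only an outline of the presumed
call ordering; the actual control flow lives in the launcher in
datachecksumsworker.c (ProcessAllDatabases() is the routine declared there) and
additionally handles retries, progress reporting and failure paths:

    static void
    enable_sequence_sketch(bool fast)
    {
        /* 1. Switch to "inprogress-on"; every backend now writes checksums. */
        SetDataChecksumsOnInProgress(fast);

        /*
         * 2. Rewrite every page of every database so that pre-existing data
         *    gets a checksum as well.
         */
        if (!ProcessAllDatabases(fast))
            return;             /* failed or aborted; state stays "inprogress-on" */

        /* 3. Switch to "on"; checksums are now both written and verified. */
        SetDataChecksumsOn(fast);
    }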
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(bool immediate_checkpoint)
+{
+	uint64		barrier;
+	int			flags;
+
+	Assert(ControlFile);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Update local state in all backends to ensure that any backend in
+		 * "on" state is changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * force checkpoint to persist the current checksum state in control
+		 * file etc.
+		 *
+		 * XXX is this safe? What if the crash/shutdown happens while waiting
+		 * for the checkpoint? Also, should we persist the checksum first and
+		 * only then flip the flag in XLogCtl?
+		 */
+		INJECTION_POINT("datachecksums-disable-checksums-pre-checkpoint", NULL);
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that a checkpoint cannot occur while we are disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * AbsorbDataChecksumsBarrier
+ *		Generic function for absorbing data checksum state changes
+ *
+ * All procsignalbarriers regarding data checksum state changes are absorbed
+ * with this function.  The set of conditions required for the state change to
+ * be accepted is listed in the checksum_barriers struct; target_state is
+ * used to look up the relevant entry.
+ */
+bool
+AbsorbDataChecksumsBarrier(int target_state)
+{
+	const		ChecksumBarrierCondition *condition = checksum_barriers;
+	int			current = LocalDataChecksumVersion;
+	bool		found = false;
+
+	/*
+	 * Find the barrier condition definition for the target state. Not finding
+	 * a condition would be a grave programmer error as the states are a
+	 * discrete set.
+	 */
+	while (condition->target != target_state && condition->target != -1)
+		condition++;
+	if (unlikely(condition->target == -1))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid target state %i for data checksum barrier",
+					   target_state));
+
+	/*
+	 * The current state MUST be equal to one of the EQ states defined in this
+	 * barrier condition, or equal to the target_state if - and only if -
+	 * InitialDataChecksumTransition is true.
+	 */
+	for (int i = 0; i < condition->barrier_eq_sz; i++)
+	{
+		if (current == condition->barrier_eq[i] ||
+			condition->barrier_eq[i] == PG_DATA_CHECKSUM_ANY_VERSION)
+			found = true;
+	}
+	if (InitialDataChecksumTransition && current == target_state)
+		found = true;
+
+	/*
+	 * The current state MUST NOT be equal to any of the NE states defined in
+	 * this barrier condition.
+	 */
+	for (int i = 0; i < condition->barrier_ne_sz; i++)
+	{
+		if (current == condition->barrier_ne[i])
+			found = false;
+	}
+
+	/*
+	 * If the relevant state criteria aren't satisfied, throw an error which
+	 * will be caught by the procsignal machinery for a later retry.
+	 */
+	if (!found)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("incorrect data checksum state %i for target state %i",
+					   current, target_state));
+
+	SetLocalDataChecksumVersion(target_state);
+	InitialDataChecksumTransition = false;
+	return true;
+}
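
As a concrete example of how the table is consulted, the barrier for fully
enabling checksums presumably ends up here with PG_DATA_CHECKSUM_VERSION as the
target, whose only accepted prior state is "inprogress-on".  A hypothetical
dispatcher (the actual procsignal.c wiring is not shown in this excerpt) could
look like:

    static bool
    process_checksum_barrier_sketch(ProcSignalBarrierType type)
    {
        switch (type)
        {
            case PROCSIGNAL_BARRIER_CHECKSUM_ON:
                return AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_VERSION);
            case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
                return AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_OFF);
            case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
                return AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
            case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
                return AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
            default:
                return true;    /* not a checksum barrier */
        }
    }

With the conditions defined above, a backend still in "off" which received
PROCSIGNAL_BARRIER_CHECKSUM_ON would error out and be retried by the procsignal
machinery, since "inprogress-on" is the only state from which "on" may be
entered (barring the initial-duplicate case).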
+
+/*
+ * InitLocalDataChecksumVersion
+ *
+ * Set up a backend-local cache of ControlFile variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+static void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	data_checksums = data_checksum_version;
+}
+
+/* guc hook */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -4892,6 +5396,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -5061,6 +5566,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* Use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6202,6 +6712,47 @@ StartupXLOG(void)
 	pfree(endOfRecoveryInfo->recoveryStopReason);
 	pfree(endOfRecoveryInfo);
 
+	/*
+	 * If we reach this point with checksums in the state inprogress-on, it
+	 * means that data checksums were in the process of being enabled when the
+	 * cluster shut down. Since processing didn't finish, the operation will
+	 * have to be restarted from scratch since there is no capability to
+	 * continue where it was when the cluster shut down. Thus, revert the
+	 * state back to off, and inform the user with a warning message. Being
+	 * able to restart processing is a TODO, but it wouldn't be possible to
+	 * restart here since we cannot launch a dynamic background worker
+	 * directly from here (it has to be from a regular backend).
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		ereport(WARNING,
+				(errmsg("data checksums state has been set of off"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+	}
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -6493,7 +7044,7 @@ GetRedoRecPtr(void)
 	XLogRecPtr	ptr;
 
 	/*
-	 * The possibly not up-to-date copy in XlogCtl is enough. Even if we
+	 * The possibly not up-to-date copy in XLogCtl is enough. Even if we
 	 * grabbed a WAL insertion lock to read the authoritative value in
 	 * Insert->RedoRecPtr, someone might update it just after we've released
 	 * the lock.
@@ -7057,6 +7608,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from XLogCtl, valid at the
+	 * time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7312,6 +7869,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7455,6 +8015,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7796,6 +8360,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8207,6 +8775,26 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	INJECTION_POINT("datachecksums-xlogchecksums-pre-xloginsert", &new_type);
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8641,6 +9229,74 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		/*
+		 * XXX Could this end up written to the control file prematurely? IIRC
+		 * that happens during checkpoint, so what if that gets triggered e.g.
+		 * because someone runs CHECKPOINT? If we then crash (or something
+		 * like that), could that confuse the instance?
+		 */
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+
+		/*
+		 * force creating a restart point for the first CHECKPOINT after
+		 * seeing XLOG_CHECKSUMS in WAL
+		 */
+		checksumRestartPoint = true;
+	}
+
+	if (checksumRestartPoint &&
+		(info == XLOG_CHECKPOINT_ONLINE ||
+		 info == XLOG_CHECKPOINT_REDO ||
+		 info == XLOG_CHECKPOINT_SHUTDOWN))
+	{
+		int			flags;
+
+		elog(LOG, "forcing creation of a restart point after XLOG_CHECKSUMS");
+
+		/* We explicitly want an immediate checkpoint here */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+
+		checksumRestartPoint = false;
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 8c3090165f0..d786374209f 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,59 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	bool		fast = PG_GETARG_BOOL(0);
+
+	ereport(LOG,
+			errmsg("disable_data_checksums, fast: %d", fast));
+
+	if (!superuser())
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				errmsg("must be superuser to change data checksum state"));
+
+	StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-
+ * like cost based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	ereport(LOG,
+			errmsg("enable_data_checksums, cost_delay: %d cost_limit: %d fast: %d", cost_delay, cost_limit, fast));
+
+	if (!superuser())
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				errmsg("must be superuser to change data checksum state"));
+
+	if (cost_delay < 0)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 93c50831b26..e8c39471607 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -782,6 +782,10 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
 		RedoStartLSN = ControlFile->checkPointCopy.redo;
 		RedoStartTLI = ControlFile->checkPointCopy.ThisTimeLineID;
+
+		elog(LOG, "InitWalRecovery checkpoint %X/%08X redo %X/%08X",
+			 LSN_FORMAT_ARGS(CheckPointLoc), LSN_FORMAT_ARGS(RedoStartLSN));
+
 		record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
 									  CheckPointTLI);
 		if (record != NULL)
@@ -1665,6 +1669,9 @@ PerformWalRecovery(void)
 	bool		reachedRecoveryTarget = false;
 	TimeLineID	replayTLI;
 
+	elog(LOG, "PerformWalRecovery checkpoint %X/%08X redo %X/%08X",
+		 LSN_FORMAT_ARGS(CheckPointLoc), LSN_FORMAT_ARGS(RedoStartLSN));
+
 	/*
 	 * Initialize shared variables for tracking progress of WAL replay, as if
 	 * we had just replayed the record before the REDO location (or the
@@ -1673,12 +1680,14 @@ PerformWalRecovery(void)
 	SpinLockAcquire(&XLogRecoveryCtl->info_lck);
 	if (RedoStartLSN < CheckPointLoc)
 	{
+		elog(LOG, "(RedoStartLSN < CheckPointLoc)");
 		XLogRecoveryCtl->lastReplayedReadRecPtr = InvalidXLogRecPtr;
 		XLogRecoveryCtl->lastReplayedEndRecPtr = RedoStartLSN;
 		XLogRecoveryCtl->lastReplayedTLI = RedoStartTLI;
 	}
 	else
 	{
+		elog(LOG, "(RedoStartLSN >= CheckPointLoc)");
 		XLogRecoveryCtl->lastReplayedReadRecPtr = xlogreader->ReadRecPtr;
 		XLogRecoveryCtl->lastReplayedEndRecPtr = xlogreader->EndRecPtr;
 		XLogRecoveryCtl->lastReplayedTLI = CheckPointTLI;
@@ -1690,6 +1699,10 @@ PerformWalRecovery(void)
 	XLogRecoveryCtl->recoveryPauseState = RECOVERY_NOT_PAUSED;
 	SpinLockRelease(&XLogRecoveryCtl->info_lck);
 
+	elog(LOG, "PerformWalRecovery lastReplayedReadRecPtr %X/%08X lastReplayedEndRecPtr %X/%08X",
+		 LSN_FORMAT_ARGS(XLogRecoveryCtl->lastReplayedReadRecPtr),
+		 LSN_FORMAT_ARGS(XLogRecoveryCtl->lastReplayedEndRecPtr));
+
 	/* Also ensure XLogReceiptTime has a sane value */
 	XLogReceiptTime = GetCurrentTimestamp();
 
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index bb7d90aa5d9..54dcfbcb333 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6d9e9..0bded82b84c 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -657,6 +657,22 @@ LANGUAGE INTERNAL
 STRICT VOLATILE PARALLEL UNSAFE
 AS 'pg_replication_origin_session_setup';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+						   fast boolean DEFAULT false)
+RETURNS void
+STRICT VOLATILE LANGUAGE internal
+PARALLEL RESTRICTED
+AS 'enable_data_checksums';
+
+CREATE OR REPLACE FUNCTION
+  pg_disable_data_checksums(fast boolean DEFAULT false)
+RETURNS void
+STRICT VOLATILE LANGUAGE internal
+PARALLEL RESTRICTED
+AS 'disable_data_checksums';
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -782,6 +798,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM public;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums(boolean) FROM public;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index dec8df4f8ee..fe149aabdbe 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1371,6 +1371,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting'
+                      WHEN 3 THEN 'waiting on temporary tables'
+                      WHEN 4 THEN 'waiting on checkpoint'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..5742a1dd724 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing ProcSignal; otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized, but that can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, and therefore may not have the current
+	 * LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 142a02eb5e9..ed3dc05406c 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -135,6 +136,12 @@ static const struct
 	},
 	{
 		"SequenceSyncWorkerMain", SequenceSyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..3deb57a96de
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1471 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums at initdb time, or on a shut-down cluster with
+ * pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This ensures
+ * that backends which have yet to move from the "on" state will still be able
+ * to validate data checksums.
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate local data_checksums state
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have the local state "enabled"
+ *
+ * There are two levels of synchronization required for enabling data checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects or processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be in Bd. Backends transition Bd -> Bi via a procsignalbarrier.  When
+ *   the DataChecksumsWorker has finished writing checksums on all pages and
+ *   enables data checksums cluster-wide, there are four sets of backends where
+ *   Bd shall be an empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the checksum
+ *     on the page happens to already match, we currently dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
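
The "dirty every page" step described above boils down to a loop over each
relation fork.  The sketch below shows the core of it; the real
ProcessSingleRelationFork further down also performs cost-based throttling,
progress reporting and interrupt handling, which are omitted here:

    static bool
    checksum_fork_sketch(Relation reln, ForkNumber forkNum,
                         BufferAccessStrategy strategy)
    {
        BlockNumber nblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);

        for (BlockNumber blkno = 0; blkno < nblocks; blkno++)
        {
            Buffer      buf = ReadBufferExtended(reln, forkNum, blkno,
                                                 RBM_NORMAL, strategy);

            LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);

            /*
             * Dirtying the page forces it to be rewritten with a checksum,
             * and the full-page image ensures that any standby ends up with
             * a checksummed copy as well.
             */
            START_CRIT_SECTION();
            MarkBufferDirty(buf);
            log_newpage_buffer(buf, false);
            END_CRIT_SECTION();

            UnlockReleaseBuffer(buf);
        }
        return true;
    }
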
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and considering
+ * it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
+	 * launcher what the target state is.
+	 */
+	DataChecksumsWorkerOperation launch_operation;
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	DataChecksumsWorkerOperation operation;
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag indicating that processing should be aborted. It is set by the SIGINT
+ * handler, and also when we notice that the target state in shared memory no
+ * longer matches the operation we are performing.
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
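+ *
+ * The launcher reads this from shared memory at startup (and again whenever
+ * the target state changes), while the per-database worker always sets it to
+ * ENABLE_DATACHECKSUMS, since disabling checksums does not require rewriting
+ * any pages.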
+ */
+static DataChecksumsWorkerOperation operation;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Main entry point for datachecksumsworker launcher process
+ *
+ * The main entry point for starting data checksum processing, used both for
+ * enabling and for disabling checksums.
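+ *
+ * For reference, the SQL-level calls that end up here look for example like:
+ *
+ *     SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200,
+ *                                     fast => false);
+ *     SELECT pg_disable_data_checksums(fast => false);
+ *
+ * Both return without waiting for processing to finish; the progress of the
+ * background work can be followed in pg_stat_progress_data_checksums.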
+ */
+void
+StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		already_running;
+
+#ifdef USE_ASSERT_CHECKING
+	/* The cost delay settings have no effect when disabling */
+	if (op == DISABLE_DATACHECKSUMS)
+		Assert(cost_delay == 0 && cost_limit == 0);
+#endif
+
+	/* Store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_operation = op;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	already_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the latest
+	 * when it's about to exit, and will loop back to process the new request. So
+	 * if the launcher is already running, we don't need to do anything more
+	 * here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!already_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksum launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksum launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+					errmsg("failed to start background worker to process data checksums"));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity), "processing: %s.%s (%s, %u blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * For now we only update the block counter for the main fork, to avoid
+	 * overly frequent progress updates. TODO: investigate whether we should
+	 * update it for other forks as well.
+	 */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  TODO: investigate whether this could
+		 * be avoided when the existing checksum is already correct and
+		 * wal_level is set to "minimal".
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check whether we have been asked to
+		 * abort; the abort request will bubble up from here. It's safe to
+		 * check this without a lock, because if we miss it being set, we will
+		 * try again soon.
+		 */
+		Assert(operation == ENABLE_DATACHECKSUMS);
+		if (DataChecksumsWorkerShmem->launch_operation == DISABLE_DATACHECKSUMS)
+			abort_requested = true;
+
+		if (abort_requested)
+			return false;
+
+		/*
+		 * For now we only update the block counter for the main fork, to
+		 * avoid overly frequent progress updates. TODO: investigate whether
+		 * we should update it for other forks as well.
+		 */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+
+	for (ForkNumber fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		/* Re-fetch the SMgrRelation on each use, since it can be invalidated */
+		if (smgrexists(RelationGetSmgr(rel), fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksum worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksum worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+					   db->dbname),
+				errhint("The max_worker_processes setting might be too low."));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+					   db->dbname),
+				errhint("More details on the error might be found in the server log."));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative other than exiting. When enabling checksums
+	 * we won't at this point have changed the pg_control version to enabled,
+	 * so when the cluster comes back up processing will have to be restarted.
+	 * When disabling, the pg_control version is set to off before this point,
+	 * so when the cluster comes back up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errcode(ERRCODE_ADMIN_SHUTDOWN),
+				errmsg("cannot enable data checksums without the postmaster process"),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			errmsg("initiating data checksum processing in database \"%s\"",
+				   db->dbname));
+
+	snprintf(activity, sizeof(activity),
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errcode(ERRCODE_ADMIN_SHUTDOWN),
+				errmsg("postmaster exited during data checksum processing in database \"%s\"",
+					   db->dbname),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				errmsg("data checksum processing was aborted in database \"%s\"",
+					   db->dbname));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that a
+ * new launcher can be started later, for example after an abort.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop, the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all currently running transactions have finished
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended. If the postmaster dies while waiting, the process
+ * exits with FATAL.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errcode(ERRCODE_ADMIN_SHUTDOWN),
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksum launcher\" started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
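+	 * This is sufficient for BuildDatabaseList() to scan pg_database, which
+	 * is a shared catalog, without tying the launcher to a single database.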
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
+	operation = DataChecksumsWorkerShmem->launch_operation;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->operation = operation;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress(DataChecksumsWorkerShmem->immediate_checkpoint);
+
+		/*
+		 * All backends are now in inprogress-on state and are writing data
+		 * checksums.  Start processing all data at rest.
+		 */
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_operation != operation)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+					errmsg("could not enable data checksums in cluster"));
+		}
+
+		/*
+		 * Data checksums have been set on all pages, set the state to on in
+		 * order to instruct backends to validate checksums on reading.
+		 */
+		SetDataChecksumsOn(DataChecksumsWorkerShmem->immediate_checkpoint);
+	}
+	else
+	{
+		int			flags;
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff(DataChecksumsWorkerShmem->immediate_checkpoint);
+
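+		/* Force a checkpoint to make the state change durable */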
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (DataChecksumsWorkerShmem->immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+	}
+
+done:
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_operation != operation)
+	{
+		DataChecksumsWorkerShmem->operation = DataChecksumsWorkerShmem->launch_operation;
+		operation = DataChecksumsWorkerShmem->launch_operation;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This repeatedly builds the list of databases to process when enabling
+ * checksums, and loops until a pass finds no databases that have not already
+ * been processed.
+ *
+ * If immediate_checkpoint is true then an immediate (CHECKPOINT_FAST)
+ * checkpoint is requested. This is useful for testing but should be avoided
+ * in production use as it may drastically affect cluster performance.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Process the shared catalogs in the first database we handle; once that
+	 * has succeeded there is no need to process them again in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
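+	 * For example, a database created from a not yet processed template (in
+	 * particular with the FILE_COPY strategy, which copies relation files
+	 * as-is) can contain pages without checksums, and must therefore be
+	 * picked up and processed in a later pass.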
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/* Allow a test case to modify the initial list of databases */
+	INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number is not changed during processing; instead the
+	 * column for processed databases is increased so that it can be compared
+	 * against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			/* Allow a test process to alter the result of the operation */
+			INJECTION_POINT("datachecksumsworker-fail-db", &result);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting with a new list" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. A failure to enable checksums in a database can mean either
+	 * that processing actually failed for some reason, or that the database
+	 * was dropped between us getting the database list and trying to process
+	 * it. Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in the processed databases which failed, and
+		 * where the failed database still exists.  This indicates that
+		 * enabling checksums actually failed, and not that the failure was
+		 * due to the db being concurrently dropped.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					errmsg("failed to enable data checksums in database \"%s\"", db->dbname));
+			found_failed = true;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff(DataChecksumsWorkerShmem->immediate_checkpoint);
+		/* Force a checkpoint to make everything consistent */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+				errmsg("data checksums could not be enabled in all databases, aborting"),
+				errhint("The server log might have more information on the cause of the error."));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint for the
+	 * checksums to change from in-progress to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. Immediate checkpoints
+	 * are mainly intended for the tests, which otherwise cannot reliably be
+	 * kept within their timeout limits.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BARRIER);
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even though these assignments are redundant with the MemSet above,
+		 * we want to be explicit about the initial state for readability.
+		 */
+		DataChecksumsWorkerShmem->launch_operation = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage, and thus need data
+ * checksums, are returned. If include_shared is true then shared relations
+ * are included as well in a non-temporary list. include_shared has no
+ * relevance when building a list of temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait
+ * for all temporary relations that were present when processing started to
+ * disappear before returning, since we cannot rewrite temporary relations
+ * (which belong to other backends) with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
+	operation = ENABLE_DATACHECKSUMS;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present in this database as we start. We
+	 * need to wait until they are all gone before we are done, since we
+	 * cannot access these relations to modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->operation == ENABLE_DATACHECKSUMS);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
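+	 * Using the small ring buffer of the VACUUM strategy avoids evicting
+	 * large parts of shared buffers while we read every page in the database.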
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* Process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				errmsg("data checksum processing aborted in database OID %u",
+					   dboid));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		INJECTION_POINT("datachecksumsworker-fake-temptable-wait", &numleft);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for; indicate this in
+		 * pg_stat_activity.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 3000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_TEMPTABLE_WAIT);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_operation != operation;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					errmsg("data checksum processing aborted in database OID %u",
+						   dboid));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 00de559ba8f..8910f099018 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2991,6 +2991,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2fa045e6b0f..44213d140ae 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -30,6 +30,8 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -150,6 +152,7 @@ CalculateShmemSize(int *num_semaphores)
 	size = add_size(size, InjectionPointShmemSize());
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -332,6 +335,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..2f6ccdfb32f 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,12 +18,14 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/pg_bitutils.h"
 #include "replication/logicalworker.h"
 #include "replication/walsender.h"
+#include "storage/checksum.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
@@ -576,6 +578,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_OFF);
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index aac6e695954..cfb1753ffba 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -151,8 +151,8 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 		if ((flags & (PIV_LOG_WARNING | PIV_LOG_LOG)) != 0)
 			ereport(flags & PIV_LOG_WARNING ? WARNING : LOG,
 					(errcode(ERRCODE_DATA_CORRUPTED),
-					 errmsg("page verification failed, calculated checksum %u but expected %u",
-							checksum, p->pd_checksum)));
+					 errmsg("page verification failed, calculated checksum %u but expected %u (page LSN %X/%08X)",
+							checksum, p->pd_checksum, LSN_FORMAT_ARGS(PageXLogRecPtrGet(p->pd_lsn)))));
 
 		if (header_sane && (flags & PIV_IGNORE_CHECKSUM_FAILURE))
 			return true;
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 199ba2cc17a..7afe0098267 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -380,6 +380,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 13ae57ed649..a290d56f409 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -362,6 +362,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 7553f6eacef..430178c699c 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -116,6 +116,9 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for concurrent transactions to finish before enabling data checksums."
+CHECKSUM_ENABLE_TEMPTABLE_WAIT	"Waiting for temporary tables to be dropped so that data checksums can be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -355,6 +358,7 @@ DSMRegistry	"Waiting to read or update the dynamic shared memory registry."
 InjectionPoint	"Waiting to read or update information related to injection points."
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
+DataChecksumsWorker	"Waiting to read or update the state of the data checksums worker."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index a710508979e..5df447b6788 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -295,6 +295,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1167,9 +1169,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1185,9 +1184,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index fec79992c8d..9b78e0012ef 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -844,7 +844,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 98f9598cd78..b598deb5648 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -746,6 +746,24 @@ InitPostgres(const char *in_dbname, Oid dboid,
 
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * does not handle barriers, therefore it may not have the current
+	 * LocalDataChecksumVersion value (it'll have the value read from the
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
@@ -874,7 +892,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..1330840e9c3 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -531,11 +531,12 @@
   max => '1.0',
 },
 
-{ name => 'data_checksums', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+{ name => 'data_checksums', type => 'enum', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
   short_desc => 'Shows whether data checksums are turned on for this cluster.',
   flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED',
   variable => 'data_checksums',
-  boot_val => 'false',
+  boot_val => 'PG_DATA_CHECKSUM_OFF',
+  options => 'data_checksums_options',
 },
 
 # Can't be set by ALTER SYSTEM as it can lead to recursive definition
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 00c8376cf4d..ae1506d87f5 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", PG_DATA_CHECKSUM_VERSION, true},
+	{"off", PG_DATA_CHECKSUM_OFF, true},
+	{"inprogress-on", PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, true},
+	{"inprogress-off", PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -617,7 +625,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 46cb2f36efa..327a677cb81 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -585,7 +585,7 @@ main(int argc, char *argv[])
 		ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
 		pg_fatal("cluster must be shut down");
 
-	if (ControlFile->data_checksum_version == 0 &&
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_CHECK)
 		pg_fatal("data checksums are not enabled in cluster");
 
@@ -593,7 +593,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 10de058ce91..acf5c7b026e 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -280,6 +280,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksum processing is in progress in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 605280ed8fb..100df16384f 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +231,16 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(bool immediate_checkpoint);
+extern void SetDataChecksumsOn(bool immediate_checkpoint);
+extern void SetDataChecksumsOff(bool immediate_checkpoint);
+extern bool AbsorbDataChecksumsBarrier(int target_state);
+extern const char *show_data_checksums(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 34deb2fe5f0..faaa0e62d38 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	uint32		new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 63e834a6ce4..a8877fb87d1 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -219,7 +223,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;	/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 34b7fddb0e7..faf7df6ead6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12381,6 +12381,25 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'bool', proallargtypes => '{bool}',
+  proargmodes => '{i}',
+  proargnames => '{fast}',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
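As an aside for anyone trying the patch out: a minimal sketch of how the two
new SQL-level functions can be called from a TAP test, matching the catalog
entries above.  The argument values below are placeholders of mine; the
accepted ranges for cost_delay and cost_limit are defined in the function
implementation rather than in this hunk.

    # Enable data checksums with placeholder throttling values and without
    # forcing an immediate checkpoint
    $node->safe_psql('postgres',
        "SELECT pg_enable_data_checksums(0, 100, false);");

    # Disable data checksums, forcing an immediate checkpoint
    $node->safe_psql('postgres',
        "SELECT pg_disable_data_checksums(true);");
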
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1cde4bd9bcf..d2aa148533b 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -162,4 +162,21 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING			2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 4
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BARRIER	5
+
 #endif
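These parameters are surfaced through the backend progress machinery; the name
of the catalog view is not part of this hunk, so the sketch below polls
pg_stat_get_progress_info() directly and assumes the command type string is
registered as 'DATACHECKSUMS', which is a guess on my part.  Param numbering
follows the defines above (index N maps to param(N+1), so BLOCKS_DONE is
param7).

    # Hypothetical: poll the number of processed blocks while checksums are
    # being enabled; 'DATACHECKSUMS' is an assumed command type string.
    my $blocks_done = $node->safe_psql('postgres',
        "SELECT param7 FROM pg_stat_get_progress_info('DATACHECKSUMS');");
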
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..581fbae2ee0 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -367,6 +367,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -392,6 +395,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..2cd066fd0fe
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for data checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Possible operations the Datachecksumsworker can perform */
+typedef enum DataChecksumsWorkerOperation
+{
+	ENABLE_DATACHECKSUMS,
+	DISABLE_DATACHECKSUMS,
+	/* TODO: VERIFY_DATACHECKSUMS, */
+}			DataChecksumsWorkerOperation;
+
+/*
+ * Possible states for a database entry which has been processed. Exported
+ * here since we want to be able to reference this from injection point tests.
+ */
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
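For reference, observing the launcher/worker pair from a TAP test only needs
the backend_type strings registered in proctypelist.h just below and the stock
polling helper, along the lines of what the tests in this patch already do:

    # Wait until neither the datachecksum launcher nor a worker is running
    $node->poll_query_until(
        'postgres',
        "SELECT count(*) FROM pg_stat_activity WHERE backend_type "
          . "IN ('datachecksum launcher', 'datachecksum worker');",
        '0');
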
diff --git a/src/include/postmaster/proctypelist.h b/src/include/postmaster/proctypelist.h
index 242862451d8..3dc93b176d9 100644
--- a/src/include/postmaster/proctypelist.h
+++ b/src/include/postmaster/proctypelist.h
@@ -38,6 +38,8 @@ PG_PROCTYPE(B_BACKEND, gettext_noop("client backend"), BackendMain, true)
 PG_PROCTYPE(B_BG_WORKER, gettext_noop("background worker"), BackgroundWorkerMain, true)
 PG_PROCTYPE(B_BG_WRITER, gettext_noop("background writer"), BackgroundWriterMain, true)
 PG_PROCTYPE(B_CHECKPOINTER, gettext_noop("checkpointer"), CheckpointerMain, true)
+PG_PROCTYPE(B_DATACHECKSUMSWORKER_LAUNCHER, gettext_noop("datachecksum launcher"), NULL, false)
+PG_PROCTYPE(B_DATACHECKSUMSWORKER_WORKER, gettext_noop("datachecksum worker"), NULL, false)
 PG_PROCTYPE(B_DEAD_END_BACKEND, gettext_noop("dead-end client backend"), BackendMain, true)
 PG_PROCTYPE(B_INVALID, gettext_noop("unrecognized"), NULL, false)
 PG_PROCTYPE(B_IO_WORKER, gettext_noop("io worker"), IoWorkerMain, true)
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index abc2cf2a020..2fb242f029d 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -16,6 +16,7 @@
 
 #include "access/xlogdefs.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/off.h"
 
 /* GUC variable */
@@ -204,7 +205,6 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
-#define PG_DATA_CHECKSUM_VERSION	1
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..0faaac14b1b 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,21 @@
 
 #include "storage/block.h"
 
+/*
+ * Checksum version 0 is used when data checksums are disabled (OFF).
+ * PG_DATA_CHECKSUM_VERSION indicates that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION indicates that
+ * data checksums are currently being enabled or disabled, respectively.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION,
+	PG_DATA_CHECKSUM_ANY_VERSION
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
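These internal states map onto the values reported by the expanded
data_checksums GUC; 'off', 'on' and 'inprogress-on' are the strings exercised
by the tests further down, and 'inprogress-off' is presumably the string
reported while checksums are being disabled.  Reading the current state from a
test or monitoring script is just:

    # Returns 'off', 'on', 'inprogress-on' or 'inprogress-off'
    my $state = $node->safe_psql('postgres',
        "SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';");
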
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 06a1ffd4b08..b8f7ba0be51 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -85,6 +85,7 @@ PG_LWLOCK(50, DSMRegistry)
 PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
+PG_LWLOCK(54, DataChecksumsWorker)
 
 /*
  * There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these.  The data checksums launcher and worker
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..28c8d0bd3cf 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -18,6 +18,7 @@ SUBDIRS = \
 		  test_binaryheap \
 		  test_bitmapset \
 		  test_bloomfilter \
+		  test_checksums \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
 		  test_ddl_deparse \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..88b8b369534 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -17,6 +17,7 @@ subdir('test_aio')
 subdir('test_binaryheap')
 subdir('test_bitmapset')
 subdir('test_bloomfilter')
+subdir('test_checksums')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
 subdir('test_ddl_deparse')
diff --git a/src/test/modules/test_checksums/.gitignore b/src/test/modules/test_checksums/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/modules/test_checksums/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
new file mode 100644
index 00000000000..a5b6259a728
--- /dev/null
+++ b/src/test/modules/test_checksums/Makefile
@@ -0,0 +1,40 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/modules/test_checksums
+#
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/modules/test_checksums/Makefile
+#
+#-------------------------------------------------------------------------
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+MODULE_big = test_checksums
+OBJS = \
+	$(WIN32RES) \
+	test_checksums.o
+PGFILEDESC = "test_checksums - test code for data checksums"
+
+EXTENSION = test_checksums
+DATA = test_checksums--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/test/modules/test_checksums/README b/src/test/modules/test_checksums/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/modules/test_checksums/README
@@ -0,0 +1,22 @@
+src/test/modules/test_checksums/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: This creates a temporary installation (in the case of "check")
+with multiple nodes, primary as well as standby(s), for the purpose
+of the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
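A note for anyone running these: several of the scripts skip themselves unless
the longer randomized tests are opted into via PG_TEST_EXTRA, and unless the
build has injection points enabled (-Dinjection_points=true with meson,
--enable-injection-points with configure).  A typical invocation of the full
suite would be something like:

    PG_TEST_EXTRA=checksum_extended make -C src/test/modules/test_checksums check
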
diff --git a/src/test/modules/test_checksums/meson.build b/src/test/modules/test_checksums/meson.build
new file mode 100644
index 00000000000..ffc737ca87a
--- /dev/null
+++ b/src/test/modules/test_checksums/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+test_checksums_sources = files(
+	'test_checksums.c',
+)
+
+test_checksums = shared_module('test_checksums',
+	test_checksums_sources,
+	kwargs: pg_test_mod_args,
+)
+test_install_libs += test_checksums
+
+test_install_data += files(
+	'test_checksums.control',
+	'test_checksums--1.0.sql',
+)
+
+tests += {
+  'name': 'test_checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+      'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+    },
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+      't/005_injection.pl',
+      't/006_pgbench_single.pl',
+      't/007_pgbench_standby.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_checksums/t/001_basic.pl b/src/test/modules/test_checksums/t/001_basic.pl
new file mode 100644
index 00000000000..728a5c4510c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/001_basic.pl
@@ -0,0 +1,63 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+test_checksum_state($node, 'off');
+
+# Enable data checksums and wait for the state transition to 'on'
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1 ");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again, which should be a no-op, so we explicitly don't
+# wait for any state transition since none should happen here.
+enable_data_checksums($node);
+test_checksum_state($node, 'on');
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again and wait for the state transition
+disable_data_checksums($node, wait => 'on');
+
+# Test reading data again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/002_restarts.pl b/src/test/modules/test_checksums/t/002_restarts.pl
new file mode 100644
index 00000000000..6c17f304eac
--- /dev/null
+++ b/src/test/modules/test_checksums/t/002_restarts.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with a
+# restart which breaks processing.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Initialize result storage for queries
+my $result;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 6
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Create a barrier for checksumming to block on, in this case a pre-
+	# existing temporary table which is kept open while processing is started.
+	# We can accomplish this by setting up an interactive psql process which
+	# keeps the temporary table alive while we enable checksums in another
+	# psql process.
+	#
+	# This is a similar test to the synthetic variant in 005_injection.pl
+	# which fakes this scenario.
+	my $bsession = $node->background_psql('postgres');
+	$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+	# In another session, make sure we can see the blocking temp table but
+	# start processing anyways and check that we are blocked with a proper
+	# wait event.
+	$result = $node->safe_psql('postgres',
+		"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';"
+	);
+	is($result, 't', 'ensure we can see the temporary table');
+
+	# Enabling data checksums shouldn't complete since the worker is blocked
+	# on the temporary table held open by $bsession. Ensure that we reach
+	# inprogress-on before we do more tests.
+	enable_data_checksums($node, wait => 'inprogress-on');
+
+	# Wait for processing to reach the point where the worker is waiting for
+	# leftover temp relations before it can actually finish
+	$result = $node->poll_query_until(
+		'postgres',
+		"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksum worker';",
+		'ChecksumEnableTemptableWait');
+
+	# The datachecksumsworker waits for temporary tables to disappear for 3
+	# seconds before retrying, so sleep for 4 seconds to be guaranteed to see
+	# a retry cycle
+	sleep(4);
+
+	# Re-check the wait event to ensure we are blocked on the right thing.
+	$result = $node->safe_psql('postgres',
+			"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksum worker';");
+	is($result, 'ChecksumEnableTemptableWait',
+		'ensure the correct wait condition is set');
+	test_checksum_state($node, 'inprogress-on');
+
+	# Stop the cluster while bsession is still attached.  We can't close the
+	# session first since the brief period between closing and stopping might
+	# be enough for checksums to get enabled.
+	$node->stop;
+	$bsession->quit;
+	$node->start;
+
+	# Ensure the checksums aren't enabled across the restart.  This leaves the
+	# cluster in the same state as before we entered the SKIP block.
+	test_checksum_state($node, 'off');
+}
+
+enable_data_checksums($node, wait => 'on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksum%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+disable_data_checksums($node, wait => 1);
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..f724d4ea74c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -0,0 +1,114 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+test_checksum_state($node_primary, 'off');
+test_checksum_state($node_standby_1, 'off');
+
+# ---------------------------------------------------------------------------
+# Enable checksums for the cluster, and make sure that both the primary and
+# standby change state.
+#
+
+# Ensure that the primary switches to "inprogress-on"
+enable_data_checksums($node_primary, wait => 'inprogress-on');
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+my $result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary and standby
+wait_for_checksum_state($node_primary, 'on');
+wait_for_checksum_state($node_standby_1, 'on');
+
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, '19998', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksum%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+#
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+#
+
+# Disable checksums and wait for the operation to be replayed
+disable_data_checksums($node_primary);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+# Ensure that the primary and standby have switched to off
+wait_for_checksum_state($node_primary, 'off');
+wait_for_checksum_state($node_standby_1, 'off');
+# Double-check reading data without errors
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, "19998", 'ensure we can safely read all data without checksums');
+
+$node_standby_1->stop;
+$node_primary->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/004_offline.pl b/src/test/modules/test_checksums/t/004_offline.pl
new file mode 100644
index 00000000000..e9fbcf77eab
--- /dev/null
+++ b/src/test/modules/test_checksums/t/004_offline.pl
@@ -0,0 +1,82 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table alive while we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table but start
+# processing anyways and check that we are blocked with a proper wait event.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+enable_data_checksums($node, wait => 'inprogress-on');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/005_injection.pl b/src/test/modules/test_checksums/t/005_injection.pl
new file mode 100644
index 00000000000..ae801cd336f
--- /dev/null
+++ b/src/test/modules/test_checksums/t/005_injection.pl
@@ -0,0 +1,126 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# injection point tests injecting failures into the processing
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# ---------------------------------------------------------------------------
+# Test cluster setup
+#
+
+# Initiate testcluster
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Set up test environment
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+
+# ---------------------------------------------------------------------------
+# Inducing failures and crashes in processing
+
+# Force enabling checksums to fail by marking one of the databases as having
+# failed in processing.
+disable_data_checksums($node, wait => 1);
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(true);');
+enable_data_checksums($node, wait => 'off');
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(false);');
+
+# Force the server to crash after enabling data checksums but before issuing
+# the checkpoint.  Since the switch has been WAL logged the server should come
+# up with checksums enabled after replay.
+test_checksum_state($node, 'off');
+$node->safe_psql('postgres', 'SELECT dc_crash_before_checkpoint();');
+enable_data_checksums($node, fast => 'true');
+my $ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detected before timeout");
+ok(!$node->is_alive, "Cluster crashed due to abort() before checkpointing");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'on');
+
+# Another test just like the previous one, but for disabling data checksums
+# (and crashing just before checkpointing).  The previous injection points
+# were all detached by the crash, so they need to be reattached.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_checkpoint();');
+disable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detected before timeout");
+ok(!$node->is_alive, "Cluster crashed due to abort() before checkpointing");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# Now inject a crash before inserting the WAL record for the data checksum
+# state change; when the server comes back up again the state should not have
+# been set to the new value since the process didn't succeed.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_xlog();');
+enable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detected before timeout");
+ok(!$node->is_alive, "Cluster crashed");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# This re-runs the same test, starting from data checksums being disabled and
+# enabling them again, crashing right before inserting the WAL record.  When
+# the server comes back up the checksums must not be enabled.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_xlog();');
+enable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detected before timeout");
+ok(!$node->is_alive, "Cluster crashed");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# ---------------------------------------------------------------------------
+# Timing and retry related tests
+#
+
+# Force the enable checksums processing to make multiple passes by removing
+# one database from the list in the first pass.  This simulates a CREATE
+# DATABASE issued during processing.  Doing this via fault injection keeps
+# the test from being subject to exact timing.
+$node->safe_psql('postgres', 'SELECT dcw_prune_dblist(true);');
+enable_data_checksums($node, wait => 'on');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 4
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Inject a delay in the barrier for enabling checksums
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_inject_delay_barrier();');
+	enable_data_checksums($node, wait => 'on');
+
+	# Fake the existence of a temporary table at the start of processing,
+	# which will force the processing to wait and retry until the table
+	# disappears.
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);');
+	enable_data_checksums($node, wait => 'on');
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/006_pgbench_single.pl b/src/test/modules/test_checksums/t/006_pgbench_single.pl
new file mode 100644
index 00000000000..96f3b2cd8a6
--- /dev/null
+++ b/src/test/modules/test_checksums/t/006_pgbench_single.pl
@@ -0,0 +1,268 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# concurrent activity via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+my $node;
+my $node_loglocation = 0;
+
+# The number of full test iterations which will be performed. The exact number
+# of tests performed and the wall time taken are non-deterministic as the test
+# performs a lot of randomized actions, but 10 iterations will be a long test
+# run regardless.
+my $TEST_ITERATIONS = 10;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+my $pgbench = undef;
+
+# determines whether enable_data_checksums/disable_data_checksums forces an
+# immediate checkpoint
+my @flip_modes = ('true', 'false');
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter.
+sub background_rw_pgbench
+{
+	my $port = shift;
+
+	# If a previous pgbench is still running, start by shutting it down.
+	if ($pgbench)
+	{
+		$pgbench->finish;
+	}
+
+	# Randomize the number of pgbench clients a bit (range 1-15)
+	my $clients = 1 + int(rand(15));
+
+	my @cmd = ('pgbench', '-p', $port, '-T', '600', '-c', $clients);
+
+	# Randomize whether we spawn connections or not
+	push(@cmd, '-C') if (cointoss);
+	# Finally add the database name to use
+	push(@cmd, 'postgres');
+
+	$pgbench = IPC::Run::start(
+		\@cmd,
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Invert the state of data checksums in the cluster: if data checksums are on
+# then disable them, and vice versa. Also performs proper validation of the
+# before and after state.
+sub flip_data_checksums
+{
+	# First, make sure the cluster is in the state we expect it to be
+	test_checksum_state($node, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node->safe_psql('postgres', 'SELECT dcw_fake_temptable();')
+		  if cointoss();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before enabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums(
+			$node,
+			wait => 'inprogress-on',
+			'fast' => $mode);
+
+		random_sleep();
+
+		# Wait for checksums enabled on the primary
+		wait_for_checksum_state($node, 'on');
+
+		# log LSN right after the primary flips checksums to "on"
+		$result = $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after enabling: " . $result . "\n");
+
+		random_sleep();
+
+		$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before disabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		disable_data_checksums($node, 'fast' => $mode);
+
+		# Wait for checksums disabled on the primary
+		wait_for_checksum_state($node, 'off');
+
+		# log LSN right after the primary flips checksums to "off"
+		$result = $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after disabling: " . $result . "\n");
+
+		random_sleep();
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since that might slip by subtly, let's ensure it gets
+		# caught with a test error if so.
+		BAIL_OUT('data_checksum_state variable has invalid state: '
+			  . $data_checksum_state);
+	}
+}
+
+# Create and start a cluster with one node
+$node = PostgreSQL::Test::Cluster->new('main');
+$node->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be bumped in order to accommodate the pgbench
+# clients, and log_statement is dialled down since it would otherwise generate
+# enormous amounts of logging. Page verification failures are still logged.
+$node->append_conf(
+	'postgresql.conf',
+	qq[
+max_connections = 100
+log_statement = none
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+# Initialize pgbench
+$node->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+# Start the test suite with pgbench running.
+background_rw_pgbench($node->port);
+
+# Main test suite. This loop will start a pgbench run on the cluster and,
+# while that's running, flip the state of data checksums concurrently. It will
+# then randomly restart the cluster (in fast or immediate mode) and check for
+# the desired state.  The idea behind doing things randomly is to stress out
+# any timing related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced back
+# to its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	note("iteration ", ($i + 1), " of ", $TEST_ITERATIONS);
+
+	if (!$node->is_alive)
+	{
+		random_sleep();
+
+		# Start, to do recovery, and stop
+		$node->start;
+		$node->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile,
+			$node_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (during WAL recovery)"
+		);
+		$node_loglocation = -s $node->logfile;
+
+		# Randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(1024));
+		$node->append_conf('postgresql.conf', qq[max_wal_size = $sb]);
+
+		$node->start;
+
+		# Start a pgbench in the background against the primary
+		background_rw_pgbench($node->port);
+	}
+
+	$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result =
+	  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+
+	random_sleep();
+
+	# Potentially powercycle the node
+	if (cointoss())
+	{
+		$node->stop(stopmode());
+
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node->data_dir);
+
+		my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile,
+			$node_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (outside WAL recovery)"
+		);
+		$node_loglocation = -s $node->logfile;
+	}
+
+	random_sleep();
+}
+
+# Make sure the node is running
+if (!$node->is_alive)
+{
+	$node->start;
+}
+
+# Testrun is over, ensure that data reads back as expected and perform a final
+# verification of the data checksum state.
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node, $data_checksum_state);
+
+# Perform one final pass over the logs and hunt for unexpected errors
+my $log =
+  PostgreSQL::Test::Utils::slurp_file($node->logfile, $node_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in primary log");
+$node_loglocation = -s $node->logfile;
+
+$node->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/007_pgbench_standby.pl b/src/test/modules/test_checksums/t/007_pgbench_standby.pl
new file mode 100644
index 00000000000..8b8e031cbf6
--- /dev/null
+++ b/src/test/modules/test_checksums/t/007_pgbench_standby.pl
@@ -0,0 +1,398 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster,
+# consisting of a primary and a replicated standby, with concurrent activity
+# via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+my $node_primary_slot = 'physical_slot';
+my $node_primary_backup = 'primary_backup';
+my $node_primary;
+my $node_primary_loglocation = 0;
+my $node_standby_1;
+my $node_standby_1_loglocation = 0;
+
+# The number of full test iterations which will be performed. The exact number
+# of tests performed and the wall time taken are non-deterministic as the test
+# performs a lot of randomized actions, but 5 iterations will be a long test
+# run regardless.
+my $TEST_ITERATIONS = 5;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+
+my $pgbench_primary = undef;
+my $pgbench_standby = undef;
+
+# Variables holding state for managing the cluster and aux processes in
+# various ways
+my ($pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr) =
+  ('', '', '');
+my ($pgb_standby_1_stdin, $pgb_standby_1_stdout, $pgb_standby_1_stderr) =
+  ('', '', '');
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# determines whether enable_data_checksums/disable_data_checksums forces an
+# immediate checkpoint
+my @flip_modes = ('true', 'false');
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter
+sub background_pgbench
+{
+	my ($port, $standby) = @_;
+
+	# Terminate any currently running pgbench process before continuing
+	$pgbench_primary->finish if $pgbench_primary;
+
+	my $clients = 1 + int(rand(15));
+
+	my @cmd = ('pgbench', '-p', $port, '-T', '600', '-c', $clients);
+	# Randomize whether we spawn connections or not
+	push(@cmd, '-C') if (cointoss());
+	# If we run on a standby it needs to be a read-only benchmark
+	push(@cmd, '-S') if ($standby);
+	# Finally add the database name to use
+	push(@cmd, 'postgres');
+
+	$pgbench_primary = IPC::Run::start(
+		\@cmd,
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Invert the state of data checksums in the cluster: if data checksums are on
+# then disable them, and vice versa. Also performs proper validation of the
+# before and after state.
+sub flip_data_checksums
+{
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(true);')
+		  if cointoss();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before enabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums(
+			$node_primary,
+			wait => 'inprogress-on',
+			'fast' => $mode);
+		random_sleep();
+		# Wait for checksum enable to be replayed
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Ensure that the standby has switched to "inprogress-on" or "on".
+		# Normally it would be "inprogress-on", but it is theoretically
+		# possible for the primary to complete the checksum enabling *and* have
+		# the standby replay that record before we reach the check below.
+		$result = $node_standby_1->poll_query_until(
+			'postgres',
+			"SELECT setting = 'off' "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';",
+			'f');
+		is($result, 1,
+			'ensure standby has absorbed the inprogress-on barrier');
+		random_sleep();
+		$result = $node_standby_1->safe_psql('postgres',
+				"SELECT setting "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';");
+
+		is(($result eq 'inprogress-on' || $result eq 'on'),
+			1, 'ensure checksums are on, or in progress, on standby_1');
+
+		# Wait for checksums enabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'on');
+
+		# log LSN right after the primary flips checksums to "on"
+		$result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after enabling: " . $result . "\n");
+
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'on');
+
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before disabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		disable_data_checksums($node_primary, 'fast' => $mode);
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Wait for checksums disabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'off');
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		# log LSN right after the primary flips checksums to "off"
+		$result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after disabling: " . $result . "\n");
+
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since that might slip by subtly, let's ensure it gets
+		# caught with a test error if so.
+		is(1, 0, 'data_checksum_state variable has invalid state');
+	}
+}
+
+# Create and start a cluster with one primary and one standby node, and ensure
+# they are caught up and in sync.
+$node_primary = PostgreSQL::Test::Cluster->new('main');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be bumped in order to accommodate the pgbench
+# clients, and log_statement is dialled down since it would otherwise generate
+# enormous amounts of logging. Page verification failures are still logged.
+$node_primary->append_conf(
+	'postgresql.conf',
+	qq[
+max_connections = 30
+log_statement = none
+]);
+$node_primary->start;
+$node_primary->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$node_primary_slot');");
+$node_primary->backup($node_primary_backup);
+
+$node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $node_primary_backup,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$node_primary_slot'
+]);
+$node_standby_1->start;
+
+# Initialize pgbench and wait for the objects to be created on the standby
+$node_primary->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Start the test suite with pgbench running on all nodes
+background_pgbench($node_standby_1->port, 1);
+background_pgbench($node_primary->port, 0);
+
+# Main test suite. This loop will start a pgbench run on the cluster and,
+# while that's running, flip the state of data checksums concurrently. It will
+# then randomly restart the cluster (in fast or immediate mode) and check for
+# the desired state.  The idea behind doing things randomly is to stress out
+# any timing related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced back
+# to its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	note("iteration ", ($i + 1), " of ", $TEST_ITERATIONS);
+
+	if (!$node_primary->is_alive)
+	{
+		random_sleep();
+
+		# start, to do recovery, and stop
+		$node_primary->start;
+		$node_primary->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+			$node_primary_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (during WAL recovery)"
+		);
+		$node_primary_loglocation = -s $node_primary->logfile;
+
+		# randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(960));
+		$node_primary->append_conf('postgresql.conf', qq[max_wal_size = $sb]);
+
+		note("changing primary max_wal_size to " . $sb);
+
+		$node_primary->start;
+
+		# Start a pgbench in the background against the primary
+		background_pgbench($node_primary->port, 0);
+	}
+
+	if (!$node_standby_1->is_alive)
+	{
+		random_sleep();
+
+		$node_standby_1->start;
+		$node_standby_1->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log =
+		  PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+			$node_standby_1_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in standby_1 log (during WAL recovery)"
+		);
+		$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+		# randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(960));
+		$node_standby_1->append_conf('postgresql.conf',
+			qq[max_wal_size = $sb]);
+
+		note("changing standby max_wal_size to " . $sb);
+
+		$node_standby_1->start;
+
+		# Start a select-only pgbench in the background on the standby
+		background_pgbench($node_standby_1->port, 1);
+	}
+
+	$node_primary->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result = $node_primary->safe_psql('postgres',
+		"SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+	random_sleep();
+	$node_primary->wait_for_catchup($node_standby_1, 'write');
+
+	random_sleep();
+
+	# Potentially powercycle the cluster (the nodes independently)
+	# XXX should maybe try stopping nodes in the opposite order too?
+	if (cointoss())
+	{
+		$node_primary->stop(stopmode());
+
+		# print the contents of the control file on the primary
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node_primary->data_dir);
+
+		# slurp the file after shutdown, so that it doesn't interfere with the recovery
+		my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+			$node_primary_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (outside WAL recovery)"
+		);
+		$node_primary_loglocation = -s $node_primary->logfile;
+	}
+
+	random_sleep();
+
+	if (cointoss())
+	{
+		$node_standby_1->stop(stopmode());
+
+		# print the contents of the control file on the standby
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node_standby_1->data_dir);
+
+		# slurp the file after shutdown, so that it doesn't interfere with the recovery
+		my $log =
+		  PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+			$node_standby_1_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in standby_1 log (outside WAL recovery)"
+		);
+		$node_standby_1_loglocation = -s $node_standby_1->logfile;
+	}
+}
+
+# make sure the nodes are running
+if (!$node_primary->is_alive)
+{
+	$node_primary->start;
+}
+
+if (!$node_standby_1->is_alive)
+{
+	$node_standby_1->start;
+}
+
+# Testrun is over, ensure that data reads back as expected and perform a final
+# verification of the data checksum state.
+my $result =
+  $node_primary->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node_primary, $data_checksum_state);
+test_checksum_state($node_standby_1, $data_checksum_state);
+
+# Perform one final pass over the logs and hunt for unexpected errors
+my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+	$node_primary_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in primary log");
+$node_primary_loglocation = -s $node_primary->logfile;
+$log = PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+	$node_standby_1_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in standby_1 log");
+$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+$node_standby_1->teardown_node;
+$node_primary->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/DataChecksums/Utils.pm b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
new file mode 100644
index 00000000000..cf670be944c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
@@ -0,0 +1,283 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+DataChecksums::Utils - Utility functions for testing data checksums in a running cluster
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::Cluster;
+  use DataChecksums::Utils qw( .. );
+
+  # Create, and start, a new cluster
+  my $node = PostgreSQL::Test::Cluster->new('primary');
+  $node->init;
+  $node->start;
+
+  test_checksum_state($node, 'off');
+
+  enable_data_checksums($node);
+
+  wait_for_checksum_state($node, 'on');
+
+
+=cut
+
+package DataChecksums::Utils;
+
+use strict;
+use warnings FATAL => 'all';
+use Exporter 'import';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+our @EXPORT = qw(
+  cointoss
+  disable_data_checksums
+  enable_data_checksums
+  random_sleep
+  stopmode
+  test_checksum_state
+  wait_for_checksum_state
+  wait_for_cluster_crash
+);
+
+=pod
+
+=head1 METHODS
+
+=over
+
+=item test_checksum_state(node, state)
+
+Test that the current value of the data checksum GUC in the server running
+at B<node> matches B<state>.  If the values differ, a test failure is logged.
+Returns True if the values match, otherwise False.
+
+=cut
+
+sub test_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $result = $postgresnode->safe_psql('postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+	);
+	is($result, $state, 'ensure checksums are set to ' . $state);
+	return $result eq $state;
+}
+
+=item wait_for_checksum_state(node, state)
+
+Test the value of the data checksum GUC in the server running at B<node>
+repeatedly until it matches B<state> or times out.  Processing will run for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.  If the
+values differ when the process times out, False is returned and a test failure
+is logged, otherwise True.
+
+=cut
+
+sub wait_for_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $res = $postgresnode->poll_query_until(
+		'postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+		$state);
+	is($res, 1, 'ensure data checksums are transitioned to ' . $state);
+	return $res == 1;
+}
+
+=item wait_for_cluster_crash(node, params)
+
+Repeatedly test whether the cluster running at B<node> responds to connections
+and return when it no longer does so, or when the check times out.  Processing
+will run for $PostgreSQL::Test::Utils::timeout_default seconds unless a timeout
+value is specified as a parameter.  Returns True if the cluster crashed, else
+False if the process timed out.
+
+=over
+
+=item timeout
+
+Approximate number of seconds to wait for the cluster to crash, default is
+$PostgreSQL::Test::Utils::timeout_default.  There is no real-time guarantee
+that the total processing time won't exceed the timeout.
+
+=back
+
+=cut
+
+sub wait_for_cluster_crash
+{
+	my $postgresnode = shift;
+	my %params = @_;
+	my $crash = 0;
+
+	$params{timeout} = $PostgreSQL::Test::Utils::timeout_default
+	  unless (defined($params{timeout}));
+
+	for (my $naps = 0; $naps < $params{timeout}; $naps++)
+	{
+		if (!$postgresnode->is_alive)
+		{
+			$crash = 1;
+			last;
+		}
+		sleep(1);
+	}
+
+	return $crash == 1;
+}
+
+=item enable_data_checksums($node, %params)
+
+Function for enabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item cost_delay
+
+The B<cost_delay> to use when enabling data checksums, default is 0.
+
+=item cost_limit
+
+The B<cost_limit> to use when enabling data checksums, default is 100.
+
+=item fast
+
+If set to C<true>, an immediate checkpoint will be issued after data
+checksums are enabled.  Setting this to C<false> will lead to slower tests.
+The default is C<true>.
+
+=item wait
+
+If defined, the function will wait for the state defined in this parameter,
+or for the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+
+=back
+
+=cut
+
+sub enable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{cost_delay} = 0 unless (defined($params{cost_delay}));
+	$params{cost_limit} = 100 unless (defined($params{cost_limit}));
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_enable_data_checksums(%s, %s, %s);
+EOQ
+
+	$postgresnode->safe_psql(
+		'postgres',
+		sprintf($query,
+			$params{cost_delay}, $params{cost_limit}, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, $params{wait})
+	  if (defined($params{wait}));
+}
+
+=item disable_data_checksums($node, %params)
+
+Function for disabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item wait
+
+If defined, the function will wait for the state to turn to B<off>, or for
+the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+Unlike in C<enable_data_checksums>, the value of the parameter is discarded.
+
+=over
+
+=item fast
+
+If set to C<true>, the checkpoint issued after disabling will be immediate,
+else it will be deferred.  The default is C<true>.
+
+=back
+
+=cut
+
+sub disable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_disable_data_checksums(%s);
+EOQ
+
+	$postgresnode->safe_psql('postgres', sprintf($query, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, 'off') if (defined($params{wait}));
+}
+
+=item cointoss
+
+Helper returning a randomly chosen boolean value, used for deciding whether
+to perform optional actions (such as stopping nodes) during testing.
+
+=back
+
+=cut
+
+sub cointoss
+{
+	return int(rand() < 0.5);
+}
+
+=item random_sleep(max)
+
+Helper for injecting random sleeps here and there in the test run. The sleep
+duration will be in the range (0,B<max>) and is intentionally unpredictable,
+to avoid sleep patterns that would mask race conditions and timing bugs.
+The default B<max> is 3 seconds.
+
+=back
+
+=cut
+
+sub random_sleep
+{
+	my $max = shift;
+	sleep(int(rand(defined($max) ? $max : 3))) if cointoss;
+}
+
+=item stopmode
+
+Small helper function for randomly selecting a valid stopmode.
+
+=back
+
+=cut
+
+sub stopmode
+{
+	return 'immediate' if (cointoss);
+	return 'fast';
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/modules/test_checksums/test_checksums--1.0.sql b/src/test/modules/test_checksums/test_checksums--1.0.sql
new file mode 100644
index 00000000000..aa086d5c430
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums--1.0.sql
@@ -0,0 +1,28 @@
+/* src/test/modules/test_checksums/test_checksums--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_checksums" to load this file. \quit
+
+CREATE FUNCTION dcw_inject_delay_barrier(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_inject_fail_database(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_prune_dblist(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_fake_temptable(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dc_crash_before_checkpoint(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dc_crash_before_xlog(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_checksums/test_checksums.c b/src/test/modules/test_checksums/test_checksums.c
new file mode 100644
index 00000000000..c182f2c868b
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.c
@@ -0,0 +1,225 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_checksums.c
+ *		Test data checksums
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_checksums/test_checksums.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/latch.h"
+#include "utils/injection_point.h"
+#include "utils/wait_event.h"
+
+#define USEC_PER_SEC    1000000
+
+PG_MODULE_MAGIC;
+
+extern PGDLLEXPORT void dc_delay_barrier(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fail_database(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_dblist(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fake_temptable(const char *name, const void *private_data, void *arg);
+
+extern PGDLLEXPORT void crash(const char *name, const void *private_data, void *arg);
+
+/*
+ * Test for delaying emission of procsignalbarriers.
+ */
+void
+dc_delay_barrier(const char *name, const void *private_data, void *arg)
+{
+	(void) name;
+	(void) private_data;
+
+	(void) WaitLatch(MyLatch,
+					 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+					 (3 * 1000),
+					 WAIT_EVENT_PG_SLEEP);
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_delay_barrier);
+Datum
+dcw_inject_delay_barrier(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-enable-checksums-delay",
+							 "test_checksums",
+							 "dc_delay_barrier",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksums-enable-checksums-delay");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+void
+dc_fail_database(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	DataChecksumsWorkerResult *res = (DataChecksumsWorkerResult *) arg;
+
+	if (first_pass)
+		*res = DATACHECKSUMSWORKER_FAILED;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_fail_database);
+Datum
+dcw_inject_fail_database(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fail-db",
+							 "test_checksums",
+							 "dc_fail_database",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fail-db");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test which removes an entry from the DatabaseList to force re-processing,
+ * since not all databases were processed in the first iteration of the loop.
+ */
+void
+dc_dblist(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	List	   *DatabaseList = (List *) arg;
+
+	if (first_pass)
+		DatabaseList = list_delete_last(DatabaseList);
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_prune_dblist);
+Datum
+dcw_prune_dblist(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-initial-dblist",
+							 "test_checksums",
+							 "dc_dblist",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-initial-dblist");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to force waiting for existing temptables.
+ */
+void
+dc_fake_temptable(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	int		   *numleft = (int *) arg;
+
+	if (first_pass)
+		*numleft = 1;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_fake_temptable);
+Datum
+dcw_fake_temptable(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fake-temptable-wait",
+							 "test_checksums",
+							 "dc_fake_temptable",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fake-temptable-wait");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+void
+crash(const char *name, const void *private_data, void *arg)
+{
+	abort();
+}
+
+/*
+ * dc_crash_before_checkpoint
+ *
+ * Ensure that the server crashes just before the checkpoint is issued after
+ * enabling or disabling checksums.
+ */
+PG_FUNCTION_INFO_V1(dc_crash_before_checkpoint);
+Datum
+dc_crash_before_checkpoint(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	InjectionPointAttach("datachecksums-enable-checksums-pre-checkpoint",
+						 "test_checksums", "crash", NULL, 0);
+	InjectionPointAttach("datachecksums-disable-checksums-pre-checkpoint",
+						 "test_checksums", "crash", NULL, 0);
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * dc_crash_before_xlog
+ *
+ * Ensure that the server crashes right before it is about to insert the
+ * XLOG_CHECKSUMS WAL record.
+ */
+PG_FUNCTION_INFO_V1(dc_crash_before_xlog);
+Datum
+dc_crash_before_xlog(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-xlogchecksums-pre-xloginsert",
+							 "test_checksums", "crash", NULL, 0);
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_checksums/test_checksums.control b/src/test/modules/test_checksums/test_checksums.control
new file mode 100644
index 00000000000..84b4cc035a7
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.control
@@ -0,0 +1,4 @@
+comment = 'Test code for data checksums'
+default_version = '1.0'
+module_pathname = '$libdir/test_checksums'
+relocatable = true
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 35413f14019..3af7944acea 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3872,6 +3872,51 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "# Enabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "# Disabling checksums in \"" . $self->data_dir . "\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
+sub checksum_verify_offline
+{
+	my ($self) = @_;
+
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-c');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2bf968ae3d3..9c4409a12a1 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2081,6 +2081,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 67e1860e984..c9feff8331e 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksum launcher|relation|bulkread
+datachecksum launcher|relation|bulkwrite
+datachecksum launcher|relation|init
+datachecksum launcher|relation|normal
+datachecksum launcher|relation|vacuum
+datachecksum launcher|temp relation|normal
+datachecksum launcher|wal|init
+datachecksum launcher|wal|normal
+datachecksum worker|relation|bulkread
+datachecksum worker|relation|bulkwrite
+datachecksum worker|relation|init
+datachecksum worker|relation|normal
+datachecksum worker|relation|vacuum
+datachecksum worker|temp relation|normal
+datachecksum worker|wal|init
+datachecksum worker|wal|normal
 io worker|relation|bulkread
 io worker|relation|bulkwrite
 io worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(95 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 2ca7b75af57..7328a685df4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -417,6 +417,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -610,6 +611,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4252,6 +4257,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#64Tomas Vondra
tomas@vondra.me
In reply to: Daniel Gustafsson (#63)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

Hi Daniel,

I took a look at the patch again, focusing mostly on the docs and
comments. I think the code is OK, I haven't noticed anything serious.

testing
-------

I'm running the TAP tests - so far it looks fine, I've done 2000
iterations of the "single" test, now running ~2000 iterations of the
"standby" test. No issues/failures so far.

The question is whether we can/should make it even more "random", by
doing restarts in more places etc. I might give it a try, if I happen to
have some free time. But no promises, I'm not sure how feasible it is.
Making it "truly random" means it's hard to deduce what should be the
actual state of checksums, etc.

review
------

Attached is a patch adding FIXME/XXX comments to a bunch of places,
which I think makes it clearer which place I'm talking about. I'll
briefly go over the items, and maybe explain them a bit more.

1) func-admin.sgml

- This is missing documentation for the "fast" parameters for both
functions (enable/disable).

- The paragraph starts with "Initiates data checksums for ..." but that
doesn't sound right to me. I think you can initiate enabling/disabling,
not "data checksums".

- I think the docs should at least briefly describe the impact on the
cluster, and also on a standby, due to having to write everything into
WAL, waiting for checkpoints etc. And maybe discuss how to mitigate that
in some way. More about the standby stuff later.

2) glossary.sgml

- This describes "checksum worker" as a process that enables or disables
checksums in a specific database, but we don't need any per-database
processes when disabling, no?

3) monitoring.sgml

- I'm not sure what "regardless" implies here. Does that mean we simply
don't hide/reset the counters?

- I added a brief explanation about using the "launcher" row for overall
progress, and per-database workers for "current progress".

- Do we want to refer to "datachecksumsworker"? Isn't it a bit too
low-level detail?

- The table of phases is missing the "waiting" and "done" phases. IMHO if
the progress view can return them, they should be in the docs.

4) wal.sgml

- I added a sentence explaining that both enabling and disabling happen
in phases, with checkpoints in between. I think that's helpful for users
and DBAs.

- The section only described "enabling checksums", but it probably
should explain the process for disabling too. Added a para.

- Again, I think we should explain the checkpoints, restartpoints,
impact on standby ... somewhere. Not sure if this is the right place.

5) xlog.c

- Some minor corrections (typos, ...).

- Isn't the claim that PG_DATA_CHECKSUM_ON_VERSION is the only place
where we check InitialDataChecksumTransition stale? AFAIK we now check
this in AbsorbDataChecksumsBarrier for all transitions, no?

6) datachecksumsworker.c

- I understand what the comments at the beginning of the file say, but I
suspect it's partially due to already knowing the code. There's a couple
places that might be more explicit, I think. For example:

- One of the items in the synchronization/correctness section states
that "Backends SHALL NOT violate local data_checksums state" but what
does "violating local data_checksums state" mean? What even is "local
state" in this context? Should we explain/define that, or would that be
unnecessarily detailed?

- The section only talks about "enabling checksums" but also discusses
all four possible states. Maybe it should talk about disabling too, as
that requires the same synchronization/correctness.

- Maybe it'd be good to make it more explicit at which point the process
waits on a barrier, which backend initiates that (and which backends are
required to absorb the signal). It kinda is there, but only indirectly.

- Another idea I had is that maybe it'd help to have some visualization
of the process (with data_checksum states, barriers, ...) e.g. in the
form of an ASCII image.

open questions
--------------

For me the main remaining question is the impact people should expect in
production systems, and maybe ways to mitigate that.

In single-node systems this is entirely fine, I think. There will be
checkpoints, but those will be "spread" and it's just the checksum
worker waiting for that to complete.

It'll write everything into WAL, but that's fairly well understood /
should be expected. We should probably mention that in the sgml docs, so
that people are not surprised their WAL archive gets huge.

I'm much more concerned about streaming replicas, because there it
forces a restart point - and it *blocks redo* until it completes. Which
means there'll be replication lag, and for synchronous standbys this
would even block progress on the primary.

We should very clearly document this. But I'm also wondering if we might
mitigate this in some way / reduce the impact.

I see some similarities to shutdown checkpoints, which can take a lot of
time if there happens to be a lot of dirty data, increasing disruption
during planned restarts (when no one can connect). A common mitigation
is to run CHECKPOINT shortly before the restart, to flush most of the
dirty data while still allowing new connections.

Maybe we could do something like that for checksum changes? I don't know
how exactly we could do that, but let's say we can predict when roughly
to expect the next state change. And we'd ensure the standby starts
flushing stuff before that, so that creating the restartpoint is cheap.
Or maybe we'd (gradually?) lower max_wal_size on the standby, to reduce
the amount of WAL as we're getting closer to the end?

regards

--
Tomas Vondra

Attachments:

checksums-review.txt (text/plain)
From 40d26a0406a213207101a837999e8b57df25fc1d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Wed, 5 Nov 2025 13:26:41 +0100
Subject: [PATCH v20251105 2/3] review

---
 doc/src/sgml/func/func-admin.sgml            | 20 ++++++++++++-----
 doc/src/sgml/glossary.sgml                   |  2 +-
 doc/src/sgml/monitoring.sgml                 | 23 ++++++++++++++++++--
 doc/src/sgml/wal.sgml                        | 18 +++++++++++++++
 src/backend/access/transam/xlog.c            | 17 +++++++++------
 src/backend/postmaster/datachecksumsworker.c | 14 ++++++++++++
 src/backend/storage/ipc/ipci.c               |  1 -
 src/include/postmaster/datachecksumsworker.h |  2 +-
 src/tools/pgindent/typedefs.list             |  2 ++
 9 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index f3a8782ede0..4d6f6a7e486 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -3008,11 +3008,11 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
         <indexterm>
          <primary>pg_enable_data_checksums</primary>
         </indexterm>
-        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type></optional> )
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type> <optional>, <parameter>fast</parameter> <type>bool</type></optional></optional> )
         <returnvalue>void</returnvalue>
        </para>
        <para>
-        Initiates data checksums for the cluster. This will switch the data
+        Initiates the process of enabling data checksums for the cluster. This will switch the data
         checksums mode to <literal>inprogress-on</literal> as well as start a
         background worker that will process all pages in the database and
         enable checksums on them. When all data pages have had checksums
@@ -3023,7 +3023,9 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
         If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
         specified, the speed of the process is throttled using the same principles as
         <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
-       </para></entry>
+       </para>
+       <para>FIXME missing documentation of the "fast" parameter</para>
+       </entry>
       </row>
 
       <row>
@@ -3031,7 +3033,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
         <indexterm>
          <primary>pg_disable_data_checksums</primary>
         </indexterm>
-        <function>pg_disable_data_checksums</function> ()
+        <function>pg_disable_data_checksums</function> ( <optional><parameter>fast</parameter> <type>bool</type></optional> )
         <returnvalue>void</returnvalue>
        </para>
        <para>
@@ -3042,12 +3044,20 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
         changed to <literal>off</literal>.  At this point the data pages will
         still have checksums recorded but they are not updated when pages are
         modified.
-       </para></entry>
+       </para>
+       <para>FIXME missing documentation of the "fast" parameter</para>
+       </entry>
       </row>
      </tbody>
     </tgroup>
    </table>
 
+   <para>
+    FIXME I think this should briefly explain how this interacts with checkpoints (through
+    the fast parameters). It should probably also discuss how this affects streaming standby
+    due to forcing a restart point, etc. And maybe comment on possible mitigations?
+   </para>
+
   </sect2>
 
   </sect1>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 9bac0c96348..3ba0e8c6c5c 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -580,7 +580,7 @@
    <glossdef>
     <para>
      An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
-     which enables or disables data checksums in a specific database.
+     which enables data checksums in a specific database.
     </para>
    </glossdef>
   </glossentry>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b56e220f3d8..7efa1af746a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3554,7 +3554,10 @@ description | Waiting for a newly initialized WAL file to reach durable storage
        database (or on a shared object).
        Detected failures are reported regardless of the
        <xref linkend="guc-data-checksums"/> setting.
-      </para></entry>
+      </para>
+      <para>XXX I'm not sure what "regardless" means in this context. I guess
+      it means we don't reset the counters when disabling checksums?</para>
+     </entry>
      </row>
 
      <row>
@@ -6959,6 +6962,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
    <structname>pg_stat_progress_data_checksums</structname> view will contain
    a row for the launcher process, and one row for each worker process which
    is currently calculating checksums for the data pages in one database.
+   The launcher provides an overview of the overall progress (how many
+   databases have been processed, how many remain), while each worker tracks
+   the progress of the database it is currently processing.
   </para>
 
   <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
@@ -6984,7 +6990,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
         <structfield>pid</structfield> <type>integer</type>
        </para>
        <para>
-        Process ID of a datachecksumworker process.
+        Process ID of a datachecksumsworker process.
+        FIXME Is datachecksumsworker defined somewhere? Does it refer to the launcher too?
        </para>
       </entry>
      </row>
@@ -7127,6 +7134,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
        The command is currently disabling data checksums on the cluster.
       </entry>
      </row>
+     <row>
+      <entry><literal>waiting</literal></entry>
+      <entry>
+       FIXME
+      </entry>
+     </row>
      <row>
       <entry><literal>waiting on temporary tables</literal></entry>
       <entry>
@@ -7141,6 +7154,12 @@ FROM pg_stat_get_backend_idset() AS backendid;
        state before finishing.
       </entry>
      </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       FIXME
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 0ada90ca0b1..8ef16d7769f 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -284,6 +284,11 @@
     <link linkend="functions-admin-checksum">functions</link>.
    </para>
 
+   <para>
+    Both enabling and disabling checksums happens in two phases, separated by
+    a checkpoint to ensure durability.
+   </para>
+
    <para>
     Enabling checksums will put the cluster checksum mode in
     <literal>inprogress-on</literal> mode.  During this time, checksums will be
@@ -314,6 +319,14 @@
     no support for resuming work from where it was interrupted.
    </para>
 
+   <para>
+    Disabling checksums will put the cluster checksum mode in
+    <literal>inprogress-off</literal> mode.  During this time, checksums will be
+    written but not verified. After all processes acknowledge the change,
+    the mode will automatically switch to <literal>off</literal>. In this case
+    rewriting all blocks is not needed, but checkpoints are still required.
+   </para>
+
    <note>
     <para>
      Enabling checksums can cause significant I/O to the system, as most of the
@@ -324,6 +337,11 @@
     </para>
    </note>
 
+   <para>
+    XXX Maybe this is the place that should mention checkpoints/restartpoints,
+    how it impacts systems/replication and how to mitigate that?
+   </para>
+
   </sect2>
  </sect1>
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f7633f47551..807b82eed4f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -607,7 +607,7 @@ typedef struct ChecksumBarrierCondition
 	int			barrier_ne[MAX_BARRIER_CONDITIONS];
 	/* The number of elements in the barrier_ne set */
 	int			barrier_ne_sz;
-}			ChecksumBarrierCondition;
+} ChecksumBarrierCondition;
 
 static const ChecksumBarrierCondition checksum_barriers[] =
 {
@@ -618,7 +618,6 @@ static const ChecksumBarrierCondition checksum_barriers[] =
 	{-1}
 };
 
-
 /*
  * Calculate the amount of space left on the page after 'endptr'. Beware
  * multiple evaluation!
@@ -694,10 +693,10 @@ static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
 /*
- * Local state fror Controlfile data_checksum_version.  After initialization
+ * Local state for Controlfile data_checksum_version.  After initialization
  * this is only updated when absorbing a procsignal barrier during interrupt
  * processing.  The reason for keeping a copy in backend-private memory is to
- * avoid locking for interrogating checksum state.  Possible values are the
+ * avoid locking for interrogating the checksum state.  Possible values are the
  * checksum versions defined in storage/bufpage.h as well as zero when data
  * checksums are disabled.
  */
@@ -712,6 +711,10 @@ static uint32 LocalDataChecksumVersion = 0;
  * processing the barrier.  This may happen if the process is spawned between
  * the update of XLogCtl->data_checksum_version and the barrier being emitted.
  * This can only happen on the very first barrier so mark that with this flag.
+ *
+ * XXX Is PG_DATA_CHECKSUM_ON_VERSION still the only transition with an assert?
+ * I think it was replaced by checking checksum_barriers for every transition,
+ * with elog(ERROR), no?
  */
 static bool InitialDataChecksumTransition = true;
 
@@ -4335,7 +4338,7 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 
 	/*
 	 * Set the data_checksum_version value into XLogCtl, which is where all
-	 * processes get the current value from. (Maybe it should go just there?)
+	 * processes get the current value from.
 	 */
 	XLogCtl->data_checksum_version = data_checksum_version;
 }
@@ -4921,7 +4924,7 @@ SetDataChecksumsOff(bool immediate_checkpoint)
 	uint64		barrier;
 	int			flags;
 
-	Assert(ControlFile);
+	Assert(ControlFile != NULL);
 
 	SpinLockAcquire(&XLogCtl->info_lck);
 
@@ -5030,7 +5033,7 @@ SetDataChecksumsOff(bool immediate_checkpoint)
 bool
 AbsorbDataChecksumsBarrier(int target_state)
 {
-	const		ChecksumBarrierCondition *condition = checksum_barriers;
+	const ChecksumBarrierCondition *condition = checksum_barriers;
 	int			current = LocalDataChecksumVersion;
 	bool		found = false;
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
index 3deb57a96de..67045b9014d 100644
--- a/src/backend/postmaster/datachecksumsworker.c
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -52,6 +52,9 @@
  *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
  *      currently connected backends have the local state "enabled"
  *
+ * FIXME I'm not 100% sure I understand what the two above points say. What does
+ * "violate local data_checksums state" means"?
+ *
  * There are two levels of synchronization required for enabling data checksums
  * in an online cluster: (i) changing state in the active backends ("on",
  * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
@@ -60,6 +63,10 @@
  * latter with ensuring that any concurrent activity cannot break the data
  * checksum contract during processing.
  *
+ * FIXME This para talks about "enabling" but then mentions all four states,
+ * including "inprogress-off" and "off". Maybe it should talk about "changing
+ * data_checksums" instead?
+ *
  * Synchronizing the state change is done with procsignal barriers, where the
  * WAL logging backend updating the global state in the controlfile will wait
  * for all other backends to absorb the barrier. Barrier absorption will happen
@@ -88,6 +95,10 @@
  *   enables data checksums cluster-wide, there are four sets of backends where
  *   Bd shall be an empty set:
  *
+ * FIXME Maybe mention which process initiates the procsignalbarrier?
+ * FIXME Don't we actually wait for the barrier before we start rewriting data?
+ * I think Bd has to be empty prior to that, otherwise it might break checksums.
+ *
  *   Bg: Backend updating the global state and emitting the procsignalbarrier
  *   Bd: Backends in "off" state
  *   Be: Backends in "on" state
@@ -124,6 +135,8 @@
  *   stop writing data checksums as no backend is enforcing data checksum
  *   validation any longer.
  *
+ * XXX Maybe it'd make sense to "visualize" the progress between the states
+ * and barriers in some way. Say, by adding an ASCII diagram.
  *
  * Potential optimizations
  * -----------------------
@@ -148,6 +161,7 @@
  *     to enable checksums on a cluster which is in inprogress-on state and
  *     may have checksummed pages (make pg_checksums be able to resume an
  *     online operation).
+ *   * Restartability (not necessarily with page granularity).
  *
  *
  * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 44213d140ae..9014e90f1c7 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -31,7 +31,6 @@
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
 #include "postmaster/datachecksumsworker.h"
-#include "postmaster/postmaster.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
index 2cd066fd0fe..0daef709ec8 100644
--- a/src/include/postmaster/datachecksumsworker.h
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -24,7 +24,7 @@ typedef enum DataChecksumsWorkerOperation
 	ENABLE_DATACHECKSUMS,
 	DISABLE_DATACHECKSUMS,
 	/* TODO: VERIFY_DATACHECKSUMS, */
-}			DataChecksumsWorkerOperation;
+} DataChecksumsWorkerOperation;
 
 /*
  * Possible states for a database entry which has been processed. Exported
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9d6b4e57cf3..3049e6018b3 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -417,6 +417,7 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumBarrierCondition
 ChecksumType
 Chromosome
 CkptSortItem
@@ -587,6 +588,7 @@ CustomScan
 CustomScanMethods
 CustomScanState
 CycleCtr
+DataChecksumsWorkerOperation
 DBState
 DCHCacheEntry
 DEADLOCK_INFO
-- 
2.51.0

#65Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#64)
3 attachment(s)
Re: Changing the state of data checksums in a running cluster

Hi,

We had some off-list discussions about this patch (me, Daniel and
Andres), and Andres mentioned he suspects the patch may be breaking PITR
in some way. We didn't have any example of that, but PITR seems like a
pretty fundamental feature, so I took it seriously and decided to do
some stress testing. And yeah, there are issues :-(

I did similar stress testing in the past, which eventually evolved
into the two TAP tests in the current patch. Those TAP tests run either
a single node or a primary-standby cluster, flip checksums on/off
while restarting the instance(s), and verify the cluster behaves OK,
with no checksum failures, unexpected states, etc.

I chose to do something similar to test PITR - run a single node with
pgbench (or some other write activity), flip checksums on/off in a loop,
while taking basebackups. And then verify that the basebackups are valid,
and can be used to start a new instance. I planned to extend this to
more elaborate tests with proper PITR using a WAL archive, etc. I didn't
get that far - this simple test already hit a couple issues.

I'm attaching sets of test scripts, and the scripts used to validate
backups. The scripts are fairly simple, but need some changes to run -
adjusting some paths, etc.

1) basebackup-short.tgz - basic basebackup test

2) basebackup-long.tgz - long-running basebackup test (throttled)

3) validate.tgz - validate backups from (1) and (2)

The test scripts expect a "scale" parameter for pgbench. I used 500, but
that's mostly arbitrary - maybe a different value would hit the issues
more often. Not sure.

While testing I ran into two issues while validating the backups (which
is essentially about performing the usual basebackup redo).

1) assert in a checkpointer

TRAP: failed Assert("TransactionIdIsValid(initial)"), File:
"procarray.c", Line: 1707, PID: 2649754
postgres: checkpointer (ExceptionalCondition+0x56) [0x55d64ec38f96]
postgres: checkpointer (+0x5341d8) [0x55d64eac41d8]
postgres: checkpointer (GetOldestTransactionIdConsideredRunning+0xc)
[0x55d64eac598c]
postgres: checkpointer (CreateRestartPoint+0x725) [0x55d64e7dd2f5]
postgres: checkpointer (CheckpointerMain+0x3ec) [0x55d64ea38a8c]
postgres: checkpointer (postmaster_child_launch+0x102) [0x55d64ea3b592]
postgres: checkpointer (+0x4ad74a) [0x55d64ea3d74a]
postgres: checkpointer (PostmasterMain+0xce7) [0x55d64ea40b97]
postgres: checkpointer (main+0x1ca) [0x55d64e713b1a]
/lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7f55f1a33ca8]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f55f1a33d65]
postgres: checkpointer (_start+0x21) [0x55d64e713e81]

I was really confused by this initially, because why would this break
tracking of running transactions in a snapshot, etc.? But that's really
a red herring - the real issue is calling CreateRestartPoint().

This happens because we need to ensure that a standby does not get
confused about checksums state, so redo of XLOG_CHECKSUMS forces
creating a restart point whenever the checksum state changes. Which has
some negative consequences, as discussed in my previous message.

But AFAICS in this case the root cause is calling this during a regular
redo, not just on a standby. Presumably there's a way to distinguish
these two cases, and skip the restart point creation on a simple redo.
Alternatively, maybe the standby should do something different instead
of creating a restart point. (More about that later.)
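
Just to illustrate the first option, a rough sketch (I haven't tried this,
and I'm assuming the XLOG_CHECKSUMS redo handler currently requests the
immediate checkpoint/restartpoint unconditionally; whether StandbyMode is
the right flag to test here, and whether it's even visible at that point,
is exactly the open question):

    /*
     * Only force a restartpoint when actually running as a standby.
     * During plain crash recovery, or when restoring a basebackup,
     * there's no restartpoint to create, which is what trips the
     * assert above.
     */
    if (StandbyMode)
        RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
                          CHECKPOINT_FORCE);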

2) unexpected state during redo of XLOG_CHECKSUMS record

An example of the failure:

LOG: redo starts at 36/5B028668
FATAL: incorrect data checksum state 3 for target state 1
CONTEXT: WAL redo at 37/E136E408 for XLOG/CHECKSUMS: on
ERROR: incorrect data checksum state 3 for target state 1
ERROR: incorrect data checksum state 3 for target state 1
ERROR: incorrect data checksum state 3 for target state 1
ERROR: incorrect data checksum state 3 for target state 1
ERROR: incorrect data checksum state 3 for target state 1
LOG: startup process (PID 2649028) exited with exit code 1

I was really confused about this at first, because I haven't seen this
during earlier testing (which did redo after a crash), so why should it
happen here?

But the reason is very simple - the basebackup can run for a while,
spanning multiple checkpoints, and also multiple changes of checksum
state (enable/disable). Furthermore, basebackup copies the pg_control
file last, and that's where we get the checksum state from.

For the earlier crash/restart test that's not the case, because each
checksum change creates a checkpoint, and the redo starts from that
point. There can't be multiple XLOG_CHECKSUMS records to apply.

So after a crash we start from the last checkpoint (i.e. from the
initial checksum state), and then there can be only a single
XLOG_CHECKSUMS record to reapply.

But with a basebackup we get the checksum state from the pg_control
copied at the end (so likely the final state, not the initial one), and
there may be multiple XLOG_CHECKSUMS in between (and multiple
checkpoints). So what happens is that we start from the final state, and
then try to apply the earlier XLOG_CHECKSUMS states. Hence the failure.

This seems like a pretty fundamental issue - we can't guarantee a backup
won't run too long, or anything like that. A full backup may easily run
for multiple hours or even days.

I suspect it should be possible to hit an issue similar to those we
observed on the standby in earlier testing (which is why we ended up
creating startpoints on a standby) with "future" blocks.

Imagine a basebackup starts, we disable checksums, and one of the blocks
gets written to the disk without a checksum before the basebackup copies
that file. Then during redo of the backup, we start with checksums=on
(assume we don't have the issue with incorrect state). If we attempt to
read the page before the XLOG_CHECKSUMS, that'll fail, because the
on-disk page does not have a valid checksum, and we need that to read
the page LSN.

I'm only speculating this can happen; I haven't actually seen / reproduced
it. But maybe it's fixed thanks to FPI, or something like that? It
didn't help on the standby, though. And in that case allowing a random
mix of pages with/without checksums in a basebackup seems problematic.

What could we do about the root cause? We discussed this with Daniel and
we've been stuck for quite a while. But then it occurred to us that maybe
we can simply "pause" the checksum state change while there's a backup in
progress. We already enable/disable FPW based on this, so why couldn't
we check XLogCtl->Insert.runningBackups, and only advance to the next
checksum state if (runningBackups==0)?
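
To make that a bit more concrete, roughly something like this in the xlog.c
code that advances the state and writes the XLOG_CHECKSUMS record (just a
sketch - the locking is hand-waved, and the obvious race with a backup
starting right after the check is something the real thing would have to
close):

    for (;;)
    {
        bool        backup_in_progress;

        /* runningBackups is only changed under the WAL insert locks */
        WALInsertLockAcquireExclusive();
        backup_in_progress = (XLogCtl->Insert.runningBackups > 0);
        WALInsertLockRelease();

        if (!backup_in_progress)
            break;

        /* wait and retry (a latch or CV would be nicer, of course) */
        pg_usleep(1000 * 1000L);
    }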

That would mean a single backup does not need to worry about seeing a
mix of blocks written with different checksum states, and it also means
the final pg_control file has the correct checksum state, because it is
not allowed to change during the basebackup.

Of course, this would mean checksum changes may take longer. A corner
case is that a database with a basebackup running 100% of the time won't
be able to change checksums on-line. But to me that seems acceptable, if
communicated / documented clearly.

It also occurred to me that something like this might help with the standby
case too. On the standby, the problem happens when it skips checkpoints
when creating a restart point, in which case redo may go too far back,
with an incompatible checksum state. Maybe if a standby reported the LSN of
the last restartpoint, the primary could use that to decide whether it's
safe to update the checksum state. Of course, this is tricky, because
standbys may be disconnected, there's cascading replication, etc.

regards

--
Tomas Vondra

Attachments:

validate.tgz (application/x-compressed-tar)
basebackup-long.tgz (application/x-compressed-tar)
basebackup-short.tgz (application/x-compressed-tar)
#66Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#65)
Re: Changing the state of data checksums in a running cluster

On 11/10/25 02:26, Tomas Vondra wrote:

What could we do about the root cause? We discussed this with Daniel and
we've been stuck for quite a while. But then it occurred to us maybe we
can simply "pause" the checksum state change while there's backup in
progress. We already enable/disable FPW based on this, so why couldn't
we check XLogCtl->Insert.runningBackups, and only advance to the next
checksum state if (runningBackups==0)?

That would mean a single backup does not need to worry about seeing a
mix of blocks written with different checksum states, and it also means
the final pg_control file has the correct checksum state, because it is
not allowed to change during the basebackup.

Of course, this would mean checksum changes may take longer. A corner
case is that database with a basebackup running 100% of the time won't
be able to change checksums on-line. But to me that seems acceptable, if
communicated / documented clearly.

After thinking about this approach a bit, I realized the basebackup may
also run on the standby. Which means the checksum process won't see it
by checking XLogCtl->Insert.runningBackups. It will merrily proceed,
breaking the standby backup just as described earlier ...

Not sure what would be a good fix. One option is to "pause" the redo,
which is what the patch already does (by forcing an immediate checkpoint
whenever checksum state changes). We could pause redo until the backup
completes. But of course, that'd be terrible - especially for syncrep. I
hoped we'd find a better approach, and pausing redo for longer goes in
the opposite direction.

On the other hand, we already have a similar issue with full_page_writes.
The backup on standby is not allowed to start if fpw=off, and if the
setting changes while the backup is running, the backup fails:

pg_basebackup: error: backup failed: ERROR: WAL generated with
"full_page_writes=off" was replayed during online backup
HINT: This means that the backup being taken on the standby is corrupt
and should not be used. Enable "full_page_writes" and run CHECKPOINT on
the primary, and then try an online backup again.

Maybe this would be acceptable for checksums too ...
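
For the record, this is roughly what an equivalent check could look like in
do_pg_backup_stop(), next to the existing full_page_writes one. The
lastChecksumChangeRecPtr field is made up - redo of XLOG_CHECKSUMS would have
to remember the LSN of the last state change, the same way
lastFpwDisableRecPtr is maintained for XLOG_FPW_CHANGE:

    if (backup_stopped_in_recovery)
    {
        XLogRecPtr  chkptr;

        SpinLockAcquire(&XLogCtl->info_lck);
        chkptr = XLogCtl->lastChecksumChangeRecPtr;     /* hypothetical */
        SpinLockRelease(&XLogCtl->info_lck);

        if (state->startpoint <= chkptr)
            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                     errmsg("data checksum state was changed during online backup"),
                     errhint("This means that the backup being taken on the standby "
                             "is corrupt and should not be used. "
                             "Take a new base backup once the checksum state change "
                             "has completed.")));
    }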

It's not exactly the same, of course. We don't really expect people to
change fpw in a running cluster.

regards

--
Tomas Vondra

#67Andreas Karlsson
andreas@proxel.se
In reply to: Tomas Vondra (#65)
Re: Changing the state of data checksums in a running cluster

Hi,

I have been following these discussions but not read the patch in detail.

This patch makes me worried, especially with the new issues recently
uncovered. This was already a quite big patch, and to fix these issues it
will likely have to become even bigger, and given how this would become a
very rarely stressed code path I wonder if we can actually ever become
confident that the patch works in all edge cases.

Something like this needs to be easy to understand for us to have any
hope at all of being comfortable with its correctness. Can we actually do that?

Andreas

#68Tomas Vondra
tomas@vondra.me
In reply to: Andreas Karlsson (#67)
Re: Changing the state of data checksums in a running cluster

On 11/19/25 22:03, Andreas Karlsson wrote:

Hi,

I have been following these discussions but not read the patch in detail.

This patch makes me worried especially with the new issues recently
uncovered. This was already a quite big patch and to fix these issues it
will likely have to become even bigger and given how this would become a
very rarely stressed code paths I wonder if we can actually ever become
confident that the patch works in all edge cases.

Something like this need to be easy to understand for us to have any
hope at all to be comfortable in the correctness. Can we actually do that?

How's this different from any other complex patch? We get more familiar
with the problem during review, identify issues, improve the patch to
address them. And then again and again.

Of course, it'd be great to have a perfect understanding of the problem
from the very beginning, but that's not always possible. And I can't
guarantee we'll find/fix all issues.

regards

--
Tomas Vondra

#69Andreas Karlsson
andreas@proxel.se
In reply to: Tomas Vondra (#68)
Re: Changing the state of data checksums in a running cluster

On 11/20/25 11:34 AM, Tomas Vondra wrote:

On 11/19/25 22:03, Andreas Karlsson wrote:

Hi,

I have been following these discussions but not read the patch in detail.

This patch makes me worried especially with the new issues recently
uncovered. This was already a quite big patch and to fix these issues it
will likely have to become even bigger and given how this would become a
very rarely stressed code paths I wonder if we can actually ever become
confident that the patch works in all edge cases.

Something like this need to be easy to understand for us to have any
hope at all to be comfortable in the correctness. Can we actually do that?

How's this different from any other complex patch? We get more familiar
with the problem during review, identify issues, improve the patch to
address them. And then again and again.

The difference I see is in how rarely anyone actually switches checksum
state in a production database, especially now that we enabled them by
default. A complex and rarely stressed code path is a minefield.

Andreas

#70Tomas Vondra
tomas@vondra.me
In reply to: Andreas Karlsson (#69)
Re: Changing the state of data checksums in a running cluster

On 11/21/25 01:44, Andreas Karlsson wrote:

On 11/20/25 11:34 AM, Tomas Vondra wrote:

On 11/19/25 22:03, Andreas Karlsson wrote:

Hi,

I have been following these discussions but not read the patch in
detail.

This patch makes me worried especially with the new issues recently
uncovered. This was already a quite big patch and to fix these issues it
will likely have to become even bigger and given how this would become a
very rarely stressed code paths I wonder if we can actually ever become
confident that the patch works in all edge cases.

Something like this need to be easy to understand for us to have any
hope at all to be comfortable in the correctness. Can we actually do
that?

How's this different from any other complex patch? We get more familiar
with the problem during review, identify issues, improve the patch to
address them. And then again and again.

The difference I see is in how rarely anyone actually switches checksum
state in a production database, especially now that we enabled them by
default. A complex and rarely stressed code path is a minefield.

True. Hence the stress testing I've been doing - and indeed, that made
us discover the various issues reported in this thread.

Still, isn't that similar to error paths in various other patches? Those
also tend to be rarely exercised in practice. I think the right way to
address that is more testing. Of course, there's a difference between
"regular bugs" and "design problems". Some of the issues are more about
the design/architecture not considering something important.

I don't know if / when this will be ready for commit. Maybe never, who
knows. I prefer going step by step. We know about a couple issues, we
need to figure out what to do about those. Then we can reconsider.

FWIW I'm not sure the number of people currently enabling checksums on
production databases is a good metric of how important the patch is.
Maybe more people would like to do that, but can't accept the downtime.

regards

--
Tomas Vondra

#71Bruce Momjian
bruce@momjian.us
In reply to: Tomas Vondra (#70)
Re: Changing the state of data checksums in a running cluster

On Fri, Nov 21, 2025 at 01:17:09PM +0100, Tomas Vondra wrote:

True. Hence the stress testing I've been doing - and indeed, that made
us discover the various issues reported in this thread.

Still, isn't that similar to error paths in various other patches? Those
also tend to be rarely exercised in practice. I think the right way to
address that is more testing. Of course, there's a difference between
"regular bugs" and "design problems". Some of the issues are more about
the design/architecture not considering something important.

I don't know if / when this will be ready for commit. Maybe never, who
knows. I prefer going step by step. We know about a couple issues, we
need to figure out what to do about those. Then we can reconsider.

FWIW I'm not sure the number of people currently enabling checksums on
production databases is a good metric of how important the patch is.
Maybe more people would like to do that, but can't accept the downtime.

I think it is a worthwhile feature. We would have had it years ago
except that people asked for restartability after a crash, and since
we don't have restart logic at the relation level, the patch got too
complex and was abandoned.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#72Andres Freund
andres@anarazel.de
In reply to: Andreas Karlsson (#69)
Re: Changing the state of data checksums in a running cluster

Hi,

On 2025-11-21 01:44:31 +0100, Andreas Karlsson wrote:

On 11/20/25 11:34 AM, Tomas Vondra wrote:

On 11/19/25 22:03, Andreas Karlsson wrote:

I have been following these discussions but not read the patch in detail.

This patch makes me worried especially with the new issues recently
uncovered. This was already a quite big patch and to fix these issues it
will likely have to become even bigger and given how this would become a
very rarely stressed code path I wonder if we can actually ever become
confident that the patch works in all edge cases.

Something like this needs to be easy to understand for us to have any
hope at all to be comfortable in the correctness. Can we actually do that?

How's this different from any other complex patch? We get more familiar
with the problem during review, identify issues, improve the patch to
address them. And then again and again.

The difference I see is in how rarely anyone actually switches checksum
state in a production database, especially now that we enabled them by
default. A complex and rarely stressed code path is a minefield.

FWIW, I think it is actually good to build the infrastructure for features
like this (i.e. dynamically reconfiguring the cluster while running),
precisely because it isn't *constantly* used.

Greetings,

Andres Freund

#73Daniel Gustafsson
daniel@yesql.se
In reply to: Tomas Vondra (#64)
1 attachment(s)
Re: Changing the state of data checksums in a running cluster

On 5 Nov 2025, at 15:05, Tomas Vondra <tomas@vondra.me> wrote:

I took a look at the patch again, focusing mostly on the docs and
comments.

While I am still working on a few ideas on how to handle the PITR issue, I
didn't want to leave this review hanging longer (and it would be nice to get it
out of the way and not mix it with other issues).

1) func-admin.sgml

- This is missing documentation for the "fast" parameters for both
functions (enable/disable).

I had originally omitted them intentionally since they were intended for
testing, but I agree that they should be documented as they might be useful
outside of testing as well.
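
As a concrete point of reference, calling the functions then looks roughly
like this (a minimal sketch based on the signatures in the attached patch;
the throttling values are purely illustrative):

  -- enable data checksums, throttled like cost-based vacuum delay,
  -- and request a fast checkpoint once processing completes
  SELECT pg_enable_data_checksums(10, 200, true);

  -- disable data checksums again, without forcing a fast checkpoint
  SELECT pg_disable_data_checksums(false);

  -- the cluster-wide state is visible through the existing GUC
  SHOW data_checksums;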

- The paragraph starts with "Initiates data checksums for ..." but that
doesn't sound right to me. I think you can initiate enabling/disabling,
not "data checksums".

Fair point.

- I think the docs should at least briefly describe the impact on the
cluster, and also on a standby, due to having to write everything into
WAL, waiting for checkpoints etc. And maybe discuss how to mitigate that
in some way. More about the standby stuff later.

Agreed, but I think it's better done in one central place like the data
checksums section in wal.sgml (which your XXX also mentions).  In the preamble
to the function table I've added a mention of the system performance impact
with a link to the aforementioned section.

2) glossary.sgml

- This describes "checksum worker" as process that enables or disables
checksums in a specific database, but we don't need any per-database
processes when disabling, no?

That's very true: the worker launcher will receive the operation, and if it is
a disable it will proceed without launching any workers since disabling is
cluster-wide. Thinking about it, maybe DataChecksumsWorkerLauncher isn't a very
good name; DataChecksumsController or DataChecksumsCoordinator might be a
better choice?

Naming seems to be hard, who knew..

3) monitoring.sgml

- I'm not sure what "regardles" implies here. Does that mean we simply
don't hide/reset the counters?

Correct, I've expanded this para to mention this.
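
(For completeness: these are the existing per-database counters, so a query
like the one below keeps working across checksum state changes; column names
as in pg_stat_database today.)

  SELECT datname, checksum_failures, checksum_last_failure
    FROM pg_stat_database
   ORDER BY datname;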

- I added a brief explanation about using the "launcher" row for overall
progress, and per-database workers for "current progress".

+1

- Do we want to refer to "datachecksumsworker"? Isn't it a bit too
low-level detail?

I think so; we should stick to "launcher process" and "worker process" and be
consistent about it.

- The table of phases is missing "waiting" and "done" phases. IMHO if
the progress view can return it, it should be in the docs.

Nice catch. The code can't actually return "waiting" since it was broken up
into three different wait phases, but one of them wasn't documented or added to
the view properly. Fixed.
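
To make the launcher/worker split concrete, monitoring could look like the
sketch below, going by the view definition in the attached patch (datname is
NULL for the launcher row):

  -- overall progress, from the launcher row
  SELECT pid, phase, databases_done, databases_total
    FROM pg_stat_progress_data_checksums
   WHERE datname IS NULL;

  -- per-database progress, from the worker rows
  SELECT pid, datname, phase, relations_done, relations_total,
         blocks_done, blocks_total
    FROM pg_stat_progress_data_checksums
   WHERE datname IS NOT NULL;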

4) wal.sgml

- I added a sentence explaining that both enabling and disabling happens
in phases, with checkpoints in between. I think that's helpful for users
and DBAs.

+1

- The section only described "enabling checksums", but it probably
should explain the process for disabling too. Added a para.

+1

- Again, I think we should explain the checkpoints, restartpoints,
impact on standby ... somewhere. Not sure if this is the right place.

I've added a subsection in the main Data Checksums section for this.

5) xlog.c

- Some minor corrections (typos, ...).

Thanks, I did some additional minor wordsmithing around these.

- Isn't the claim that PG_DATA_CHECKSUM_ON_VERSION is the only place
where we check InitialDataChecksumTransition stale? AFAIK we now check
this in AbsorbDataChecksumsBarrier for all transitions, no?

Correct, fixed.

6) datachecksumsworker.c

- One of the items in the synchronization/correctness section states
that "Backends SHALL NOT violate local data_checksums state" but what
does "violating local data_checksums state" mean? What even is "local
state in this context"? Should we explain/define that, or would that be
unnecessarily detailed?

By "local state" I was referring to the data checksum state that a backend
knows about. I've reworded this to hopefully be a little clearer.

- The section only talks about "enabling checksums" but also discusses
all four possible states. Maybe it should talk about disabling too, as
that requires the same synchronization/correctness.

- Maybe it'd be good make it more explicit at which point the process
waits on a barrier, which backend initiates that (and which backends are
required to absorb the signal). It kinda is there, but only indirectly.

I've tried to address these, but they might still be off since I am very
Stockholm syndromed into this.

- Another idea I had is that maybe it'd help to have some visualization
of the process (with data_checksum states, barriers, ...) e.g. in the
form of an ASCII image.

I instead opted for an SVG image in the docs which illustrates the states and
the transitions. My Graphviz skills aren't that great so it doesn't look all
that pretty yet, but it's something to iterate on at least.

I'm much more concerned about streaming replicas, because there it
forces a restart point - and it *blocks redo* until it completes. Which
means there'll be replication lag, and for synchronous standbys this
would even block progress on the primary.

We should very clearly document this.

Indeed. I've started on such a section.

I see some similarities to shutdown checkpoints, which can take a lot of
time if there happen to be a lot of dirty data, increasing disruption
during planned restarts (when no one can connect). A common mitigation
is to run CHECKPOINT shortly before the restart, to flush most of the
dirty data while still allowing new connections.

Maybe we could do something like that for checksum changes? I don't know
how exactly we could do that, but let's say we can predict when roughly
to expect the next state change.

Each worker knows how far it has come within its database, but is unaware of
other databases; the launcher knows how far it has come across all databases,
but is unaware of the relative size of each database. Maybe there is still a
heuristic that can be teased out of this imperfect knowledge.
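
A crude version of such a heuristic, using only what the progress view already
exposes (just a sketch, not something the patch implements; note that the
blocks_* columns only cover the relation currently being processed):

  SELECT datname,
         round(100.0 * relations_done / nullif(relations_total, 0), 1)
           AS pct_relations_done,
         round(100.0 * blocks_done / nullif(blocks_total, 0), 1)
           AS pct_of_current_relation
    FROM pg_stat_progress_data_checksums
   WHERE datname IS NOT NULL;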

And we'd ensure the standby starts
flushing stuff before that, so that creating the restartpoint is cheap.
Or maybe we'd (gradually?) lower max_wal_size on the standby, to reduce
the amount of WAL as we're getting closer to the end?

That's an interesting idea; do you know if we have processes taking a similar
approach today?
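
For the record, max_wal_size only requires a reload, so on a standby that part
could be as simple as the following (the value is illustrative; editing
postgresql.conf and reloading works just as well):

  ALTER SYSTEM SET max_wal_size = '2GB';
  SELECT pg_reload_conf();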

The attached is a rebase with the above fixes along with a few more smaller
fixups and cleanups noticed along the way, nothing which changes any
functionality though.

--
Daniel Gustafsson

Attachments:

v20251201-0001-Online-enabling-and-disabling-of-data-chec.patch (application/octet-stream)
From 30c84caf66ccb1014fec41d070eda8144034f4cf Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Fri, 15 Aug 2025 14:48:02 +0200
Subject: [PATCH v20251201] Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Data checksums could previously only be enabled during initdb or when
the cluster is offline using the pg_checksums application.  This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker which will mark all buffers
dirty for all relations with storage in order for them to have data
checksums calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.

A new test module, test_checksums, is introduced with an extensive
set of tests covering both online and offline data checksum mode
changes.  The tests for online processing are gated behind the
PG_TEST_EXTRA flag to some degree due to being very time consuming
to run.

This work is based on an earlier version of this patch which was
reviewed by among others Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  During
the work on this new version, Tomas Vondra has given invaluable
assistance with not only coding and reviewing but very in-depth
testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Co-authored-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com
---
 doc/src/sgml/func/func-admin.sgml             |   85 +
 doc/src/sgml/glossary.sgml                    |   24 +
 doc/src/sgml/images/Makefile                  |    1 +
 doc/src/sgml/images/datachecksums.gv          |   14 +
 doc/src/sgml/images/datachecksums.svg         |   81 +
 doc/src/sgml/monitoring.sgml                  |  235 ++-
 doc/src/sgml/ref/pg_checksums.sgml            |    6 +
 doc/src/sgml/regress.sgml                     |   12 +
 doc/src/sgml/wal.sgml                         |  121 +-
 src/backend/access/rmgrdesc/xlogdesc.c        |   24 +
 src/backend/access/transam/xlog.c             |  675 +++++++-
 src/backend/access/transam/xlogfuncs.c        |   57 +
 src/backend/access/transam/xlogrecovery.c     |   13 +
 src/backend/backup/basebackup.c               |    6 +-
 src/backend/catalog/system_functions.sql      |   20 +
 src/backend/catalog/system_views.sql          |   20 +
 src/backend/postmaster/Makefile               |    1 +
 src/backend/postmaster/auxprocess.c           |   19 +
 src/backend/postmaster/bgworker.c             |    7 +
 src/backend/postmaster/datachecksumsworker.c  | 1491 +++++++++++++++++
 src/backend/postmaster/meson.build            |    1 +
 src/backend/postmaster/postmaster.c           |    5 +
 src/backend/replication/logical/decode.c      |    1 +
 src/backend/storage/ipc/ipci.c                |    3 +
 src/backend/storage/ipc/procsignal.c          |   14 +
 src/backend/storage/page/README               |    4 +-
 src/backend/storage/page/bufpage.c            |   10 +-
 src/backend/utils/activity/pgstat_backend.c   |    2 +
 src/backend/utils/activity/pgstat_io.c        |    2 +
 .../utils/activity/wait_event_names.txt       |    4 +
 src/backend/utils/adt/pgstatfuncs.c           |    8 +-
 src/backend/utils/init/miscinit.c             |    3 +-
 src/backend/utils/init/postinit.c             |   20 +-
 src/backend/utils/misc/guc_parameters.dat     |    5 +-
 src/backend/utils/misc/guc_tables.c           |    9 +-
 src/bin/pg_checksums/pg_checksums.c           |    4 +-
 src/bin/pg_controldata/pg_controldata.c       |    2 +
 src/bin/pg_upgrade/controldata.c              |    9 +
 src/include/access/xlog.h                     |   14 +-
 src/include/access/xlog_internal.h            |    7 +
 src/include/catalog/pg_control.h              |    6 +-
 src/include/catalog/pg_proc.dat               |   19 +
 src/include/commands/progress.h               |   17 +
 src/include/miscadmin.h                       |    6 +
 src/include/postmaster/datachecksumsworker.h  |   51 +
 src/include/postmaster/proctypelist.h         |    2 +
 src/include/storage/bufpage.h                 |    2 +-
 src/include/storage/checksum.h                |   15 +
 src/include/storage/lwlocklist.h              |    1 +
 src/include/storage/proc.h                    |    6 +-
 src/include/storage/procsignal.h              |    5 +
 src/include/utils/backend_progress.h          |    1 +
 src/test/modules/Makefile                     |    1 +
 src/test/modules/meson.build                  |    1 +
 src/test/modules/test_checksums/.gitignore    |    2 +
 src/test/modules/test_checksums/Makefile      |   40 +
 src/test/modules/test_checksums/README        |   22 +
 src/test/modules/test_checksums/meson.build   |   36 +
 .../modules/test_checksums/t/001_basic.pl     |   63 +
 .../modules/test_checksums/t/002_restarts.pl  |  110 ++
 .../test_checksums/t/003_standby_restarts.pl  |  114 ++
 .../modules/test_checksums/t/004_offline.pl   |   82 +
 .../modules/test_checksums/t/005_injection.pl |  126 ++
 .../test_checksums/t/006_pgbench_single.pl    |  268 +++
 .../test_checksums/t/007_pgbench_standby.pl   |  398 +++++
 .../test_checksums/t/DataChecksums/Utils.pm   |  283 ++++
 .../test_checksums/test_checksums--1.0.sql    |   28 +
 .../modules/test_checksums/test_checksums.c   |  225 +++
 .../test_checksums/test_checksums.control     |    4 +
 src/test/perl/PostgreSQL/Test/Cluster.pm      |   45 +
 src/test/regress/expected/rules.out           |   36 +
 src/test/regress/expected/stats.out           |   18 +-
 src/tools/pgindent/typedefs.list              |    8 +
 73 files changed, 5032 insertions(+), 48 deletions(-)
 create mode 100644 doc/src/sgml/images/datachecksums.gv
 create mode 100644 doc/src/sgml/images/datachecksums.svg
 create mode 100644 src/backend/postmaster/datachecksumsworker.c
 create mode 100644 src/include/postmaster/datachecksumsworker.h
 create mode 100644 src/test/modules/test_checksums/.gitignore
 create mode 100644 src/test/modules/test_checksums/Makefile
 create mode 100644 src/test/modules/test_checksums/README
 create mode 100644 src/test/modules/test_checksums/meson.build
 create mode 100644 src/test/modules/test_checksums/t/001_basic.pl
 create mode 100644 src/test/modules/test_checksums/t/002_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/003_standby_restarts.pl
 create mode 100644 src/test/modules/test_checksums/t/004_offline.pl
 create mode 100644 src/test/modules/test_checksums/t/005_injection.pl
 create mode 100644 src/test/modules/test_checksums/t/006_pgbench_single.pl
 create mode 100644 src/test/modules/test_checksums/t/007_pgbench_standby.pl
 create mode 100644 src/test/modules/test_checksums/t/DataChecksums/Utils.pm
 create mode 100644 src/test/modules/test_checksums/test_checksums--1.0.sql
 create mode 100644 src/test/modules/test_checksums/test_checksums.c
 create mode 100644 src/test/modules/test_checksums/test_checksums.control

diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 1b465bc8ba7..f866799f9da 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -2979,4 +2979,89 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
 
   </sect2>
 
+  <sect2 id="functions-admin-checksum">
+   <title>Data Checksum Functions</title>
+
+   <para>
+    The functions shown in <xref linkend="functions-checksums-table" /> can
+    be used to enable or disable data checksums in a running cluster.
+   </para>
+   <para>
+    Changing data checksums can be done in a cluster with concurrent activity
+    without blocking queries, but overall system performance will be affected.
+    See <xref linkend="checksums" /> for further details on how changing the
+    data checksums mode can affect a system and possible mitigations for how
+    to reduce the impact.
+   </para>
+
+   <table id="functions-checksums-table">
+    <title>Data Checksum Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_enable_data_checksums</primary>
+        </indexterm>
+        <function>pg_enable_data_checksums</function> ( <optional><parameter>cost_delay</parameter> <type>int</type>, <parameter>cost_limit</parameter> <type>int</type>, <parameter>fast</parameter> <type>bool</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Initiates the process of enabling data checksums for the cluster. This
+        will set the data checksums mode to <literal>inprogress-on</literal>
+        as well as start a background worker that will process all pages in all
+        databases and enable data checksums on them.  When all data checksums
+        have been calculated, and written, for all pages the cluster will
+        automatically set data checksums mode to <literal>on</literal>.
+       </para>
+       <para>
+        If <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter> are
+        specified, the process is throttled using the same principles as
+        <link linkend="runtime-config-resource-vacuum-cost">Cost-based Vacuum Delay</link>.
+        If <parameter>fast</parameter> is specified as <literal>true</literal>
+        then a fast checkpoint will be issued when data checksums have been
+        enabled, which may cause a spike in I/O.
+       </para>
+       </entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_disable_data_checksums</primary>
+        </indexterm>
+        <function>pg_disable_data_checksums</function> ( <optional><parameter>fast</parameter> <type>bool</type></optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Disables data checksum calculation and validation for the cluster. This
+        will set the data checksum mode to <literal>inprogress-off</literal>
+        while data checksums are being disabled.  When all active backends have
+        stopped validating data checksums, the data checksum mode will be
+        set to <literal>off</literal>.
+       </para>
+       <para>
+        If <parameter>fast</parameter> is specified as <literal>true</literal>
+        then a fast checkpoint will be issued when data checksums have been
+        disabled, which may cause a spike in I/O.
+       </para>
+       </entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+
+  </sect2>
+
   </sect1>
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index a76cf5c383f..138dcd45fd6 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -199,6 +199,8 @@
      (but not the autovacuum workers),
      the <glossterm linkend="glossary-background-writer">background writer</glossterm>,
      the <glossterm linkend="glossary-checkpointer">checkpointer</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>,
+     the <glossterm linkend="glossary-data-checksums-worker-launcher">data checksums worker launcher</glossterm>,
      the <glossterm linkend="glossary-logger">logger</glossterm>,
      the <glossterm linkend="glossary-startup-process">startup process</glossterm>,
      the <glossterm linkend="glossary-wal-archiver">WAL archiver</glossterm>,
@@ -574,6 +576,28 @@
    <glosssee otherterm="glossary-data-directory" />
   </glossentry>
 
+  <glossentry id="glossary-data-checksums-worker">
+   <glossterm>Data Checksums Worker</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which enables data checksums in a specific database.
+    </para>
+   </glossdef>
+  </glossentry>
+
+  <glossentry id="glossary-data-checksums-worker-launcher">
+   <glossterm>Data Checksums Worker Launcher</glossterm>
+   <glossdef>
+    <para>
+     An <glossterm linkend="glossary-auxiliary-proc">auxiliary process</glossterm>
+     which starts <glossterm linkend="glossary-data-checksums-worker">data checksums worker</glossterm>
+     processes for enabling data checksums in each database, or disables data checksums
+     cluster-wide.
+    </para>
+   </glossdef>
+  </glossentry>
+
   <glossentry id="glossary-database">
    <glossterm>Database</glossterm>
    <glossdef>
diff --git a/doc/src/sgml/images/Makefile b/doc/src/sgml/images/Makefile
index fd55b9ad23f..e805487344d 100644
--- a/doc/src/sgml/images/Makefile
+++ b/doc/src/sgml/images/Makefile
@@ -3,6 +3,7 @@
 # see README in this directory about image handling
 
 ALL_IMAGES = \
+	datachecksums.svg \
 	genetic-algorithm.svg \
 	gin.svg \
 	pagelayout.svg \
diff --git a/doc/src/sgml/images/datachecksums.gv b/doc/src/sgml/images/datachecksums.gv
new file mode 100644
index 00000000000..dff3ff7340a
--- /dev/null
+++ b/doc/src/sgml/images/datachecksums.gv
@@ -0,0 +1,14 @@
+digraph G {
+    A -> B [label="SELECT pg_enable_data_checksums()"];
+    B -> C;
+    D -> A;
+    C -> D [label="SELECT pg_disable_data_checksums()"];
+    E -> A [label=" --no-data-checksums"];
+    E -> C [label=" --data-checksums"];
+
+    A [label="off"];
+    B [label="inprogress-on"];
+    C [label="on"];
+    D [label="inprogress-off"];
+    E [label="initdb"];
+}
diff --git a/doc/src/sgml/images/datachecksums.svg b/doc/src/sgml/images/datachecksums.svg
new file mode 100644
index 00000000000..8c58f42922e
--- /dev/null
+++ b/doc/src/sgml/images/datachecksums.svg
@@ -0,0 +1,81 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!-- Generated by graphviz version 14.0.5 (20251129.0259)
+ -->
+<!-- Title: G Pages: 1 -->
+<svg width="409pt" height="383pt"
+ viewBox="0.00 0.00 409.00 383.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 378.5)">
+<title>G</title>
+<polygon fill="white" stroke="none" points="-4,4 -4,-378.5 404.74,-378.5 404.74,4 -4,4"/>
+<!-- A -->
+<g id="node1" class="node">
+<title>A</title>
+<ellipse fill="none" stroke="black" cx="80.12" cy="-268" rx="27" ry="18"/>
+<text xml:space="preserve" text-anchor="middle" x="80.12" y="-262.95" font-family="Times,serif" font-size="14.00">off</text>
+</g>
+<!-- B -->
+<g id="node2" class="node">
+<title>B</title>
+<ellipse fill="none" stroke="black" cx="137.12" cy="-179.5" rx="61.59" ry="18"/>
+<text xml:space="preserve" text-anchor="middle" x="137.12" y="-174.45" font-family="Times,serif" font-size="14.00">inprogress&#45;on</text>
+</g>
+<!-- A&#45;&gt;B -->
+<g id="edge1" class="edge">
+<title>A&#45;&gt;B</title>
+<path fill="none" stroke="black" d="M76.5,-249.68C75.22,-239.14 75.3,-225.77 81.12,-215.5 84.2,-210.08 88.49,-205.38 93.35,-201.34"/>
+<polygon fill="black" stroke="black" points="95.22,-204.31 101.33,-195.66 91.16,-198.61 95.22,-204.31"/>
+<text xml:space="preserve" text-anchor="middle" x="187.62" y="-218.7" font-family="Times,serif" font-size="14.00">SELECT pg_enable_data_checksums()</text>
+</g>
+<!-- C -->
+<g id="node3" class="node">
+<title>C</title>
+<ellipse fill="none" stroke="black" cx="137.12" cy="-106.5" rx="27" ry="18"/>
+<text xml:space="preserve" text-anchor="middle" x="137.12" y="-101.45" font-family="Times,serif" font-size="14.00">on</text>
+</g>
+<!-- B&#45;&gt;C -->
+<g id="edge2" class="edge">
+<title>B&#45;&gt;C</title>
+<path fill="none" stroke="black" d="M137.12,-161.31C137.12,-153.73 137.12,-144.6 137.12,-136.04"/>
+<polygon fill="black" stroke="black" points="140.62,-136.04 137.12,-126.04 133.62,-136.04 140.62,-136.04"/>
+</g>
+<!-- D -->
+<g id="node4" class="node">
+<title>D</title>
+<ellipse fill="none" stroke="black" cx="63.12" cy="-18" rx="63.12" ry="18"/>
+<text xml:space="preserve" text-anchor="middle" x="63.12" y="-12.95" font-family="Times,serif" font-size="14.00">inprogress&#45;off</text>
+</g>
+<!-- C&#45;&gt;D -->
+<g id="edge4" class="edge">
+<title>C&#45;&gt;D</title>
+<path fill="none" stroke="black" d="M124.23,-90.43C113.36,-77.73 97.58,-59.28 84.77,-44.31"/>
+<polygon fill="black" stroke="black" points="87.78,-42.44 78.62,-37.12 82.46,-46.99 87.78,-42.44"/>
+<text xml:space="preserve" text-anchor="middle" x="214.75" y="-57.2" font-family="Times,serif" font-size="14.00">SELECT pg_disable_data_checksums()</text>
+</g>
+<!-- D&#45;&gt;A -->
+<g id="edge3" class="edge">
+<title>D&#45;&gt;A</title>
+<path fill="none" stroke="black" d="M62.52,-36.28C61.62,-68.21 60.54,-138.57 66.12,-197.5 67.43,-211.24 70.27,-226.28 73.06,-238.85"/>
+<polygon fill="black" stroke="black" points="69.64,-239.59 75.32,-248.54 76.46,-238 69.64,-239.59"/>
+</g>
+<!-- E -->
+<g id="node5" class="node">
+<title>E</title>
+<ellipse fill="none" stroke="black" cx="198.12" cy="-356.5" rx="32.41" ry="18"/>
+<text xml:space="preserve" text-anchor="middle" x="198.12" y="-351.45" font-family="Times,serif" font-size="14.00">initdb</text>
+</g>
+<!-- E&#45;&gt;A -->
+<g id="edge5" class="edge">
+<title>E&#45;&gt;A</title>
+<path fill="none" stroke="black" d="M179.16,-341.6C159.64,-327.29 129.05,-304.86 107.03,-288.72"/>
+<polygon fill="black" stroke="black" points="109.23,-286 99.1,-282.91 105.09,-291.64 109.23,-286"/>
+<text xml:space="preserve" text-anchor="middle" x="208.57" y="-307.2" font-family="Times,serif" font-size="14.00"> &#45;&#45;no&#45;data&#45;checksums</text>
+</g>
+<!-- E&#45;&gt;C -->
+<g id="edge6" class="edge">
+<title>E&#45;&gt;C</title>
+<path fill="none" stroke="black" d="M227.13,-348.04C242.29,-342.72 259.95,-334.06 271.12,-320.5 301.5,-283.62 316.36,-257.78 294.12,-215.5 268.41,-166.6 209.42,-135.53 171.52,-119.85"/>
+<polygon fill="black" stroke="black" points="172.96,-116.65 162.37,-116.21 170.37,-123.16 172.96,-116.65"/>
+<text xml:space="preserve" text-anchor="middle" x="350.87" y="-218.7" font-family="Times,serif" font-size="14.00"> &#45;&#45;data&#45;checksums</text>
+</g>
+</g>
+</svg>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index e0556b6baac..4873409c4ae 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3587,9 +3587,14 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Number of data page checksum failures detected in this
-       database (or on a shared object), or NULL if data checksums are
-       disabled.
-      </para></entry>
+       database (or on a shared object).  Detected failures are not reset if
+       the <xref linkend="guc-data-checksums"/> setting changes.  Clusters
+       which are initialized without data checksums will show this as
+       <literal>0</literal>. In <productname>PostgreSQL</productname> version
+       18 and earlier, this was set to <literal>NULL</literal> for clusters
+       with data checksums disabled.
+      </para>
+     </entry>
      </row>
 
      <row>
@@ -3598,8 +3603,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para>
       <para>
        Time at which the last data page checksum failure was detected in
-       this database (or on a shared object), or NULL if data checksums are
-       disabled.
+       this database (or on a shared object). Last failure is reported
+       regardless of the <xref linkend="guc-data-checksums"/> setting.
       </para></entry>
      </row>
 
@@ -6982,6 +6987,226 @@ FROM pg_stat_get_backend_idset() AS backendid;
 
  </sect2>
 
+ <sect2 id="data-checksum-progress-reporting">
+  <title>Data Checksum Progress Reporting</title>
+
+  <indexterm>
+   <primary>pg_stat_progress_data_checksums</primary>
+  </indexterm>
+
+  <para>
+   When data checksums are being enabled on a running cluster, the
+   <structname>pg_stat_progress_data_checksums</structname> view will contain
+   a row for the launcher process, and one row for each worker process which
+   is currently calculating and writing checksums for the data pages in a database.
+   The launcher provides an overview of the overall progress (how many databases
+   have been processed, how many remain), while the workers track progress for
+   currently processed databases.
+  </para>
+
+  <table id="pg-stat-progress-data-checksums-view" xreflabel="pg_stat_progress_data_checksums">
+   <title><structname>pg_stat_progress_data_checksums</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        Column Type
+       </para>
+       <para>
+        Description
+       </para>
+      </entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>pid</structfield> <type>integer</type>
+       </para>
+       <para>
+        Process ID of the data checksum process, launcher or worker.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>datid</structfield> <type>oid</type>
+       </para>
+       <para>
+        OID of this database, or 0 for the launcher process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>datname</structfield> <type>name</type>
+       </para>
+       <para>
+        Name of this database, or <literal>NULL</literal> for the
+        launcher process.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>phase</structfield> <type>text</type>
+       </para>
+       <para>
+        Current processing phase, see <xref linkend="datachecksum-phases"/>
+        for description of the phases.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of databases which will be processed. Only the
+        launcher process has this value set, the worker processes have this
+        set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>databases_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of databases which have been processed. Only the launcher
+        process has this value set, the worker processes have this set to
+        <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The total number of relations which will be processed, or
+        <literal>NULL</literal> if the worker process hasn't
+        calculated the number of relations yet. The launcher process has
+        this set to <literal>NULL</literal> since it isn't responsible for
+        processing relations, only launching worker processes.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>relations_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of relations which have been processed. The launcher
+        process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_total</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which will be processed,
+        or <literal>NULL</literal> if the worker process hasn't
+        calculated the number of blocks yet. The launcher process has
+        this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry">
+       <para role="column_definition">
+        <structfield>blocks_done</structfield> <type>integer</type>
+       </para>
+       <para>
+        The number of blocks in the current relation which have been processed.
+        The launcher process has this set to <literal>NULL</literal>.
+       </para>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+  <table id="datachecksum-phases">
+   <title>Data Checksum Phases</title>
+   <tgroup cols="2">
+    <colspec colname="col1" colwidth="1*"/>
+    <colspec colname="col2" colwidth="2*"/>
+    <thead>
+     <row>
+      <entry>Phase</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+    <tbody>
+     <row>
+      <entry><literal>enabling</literal></entry>
+      <entry>
+       The command is currently enabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>disabling</literal></entry>
+      <entry>
+       The command is currently disabling data checksums on the cluster.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>done</literal></entry>
+      <entry>
+       The command is done and the data checksum state in the cluster has
+       changed.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on barrier</literal></entry>
+      <entry>
+       The command is currently waiting for the current active backends to
+       acknowledge the change in data checksum state.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on checkpoint</literal></entry>
+      <entry>
+       The command is currently waiting for a checkpoint to update the checksum
+       state before finishing.
+      </entry>
+     </row>
+     <row>
+      <entry><literal>waiting on temporary tables</literal></entry>
+      <entry>
+       The command is currently waiting for all temporary tables which existed
+       at the time the command was started to be removed.
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect2>
+
  </sect1>
 
  <sect1 id="dynamic-trace">
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index e9e393495df..e764b8be04d 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -45,6 +45,12 @@ PostgreSQL documentation
    exit status is nonzero if the operation failed.
   </para>
 
+  <para>
+   When enabling checksums, if checksums were in the process of being enabled
+   when the cluster was shut down, <application>pg_checksums</application>
+   will still process all relations, regardless of any prior online processing.
+  </para>
+
   <para>
    When verifying checksums, every file in the cluster is scanned. When
    enabling checksums, each relation file block with a changed checksum is
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index fd1e142d559..da488184533 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -275,6 +275,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
 </programlisting>
    The following values are currently supported:
    <variablelist>
+    <varlistentry>
+     <term><literal>checksum_extended</literal></term>
+     <listitem>
+      <para>
+       Runs additional tests for enabling data checksums which inject delays
+       and re-tries in the processing, as well as tests that run pgbench
+       concurrently and randomly restart the cluster.  Some of these test
+       suites require injection points to be enabled in the installation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><literal>kerberos</literal></term>
      <listitem>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index f3b86b26be9..4b5df81caf2 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -246,9 +246,10 @@
   <para>
    Checksums can be disabled when the cluster is initialized using <link
    linkend="app-initdb-data-checksums"><application>initdb</application></link>.
-   They can also be enabled or disabled at a later time as an offline
-   operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   They can also be enabled or disabled at a later time either as an offline
+   operation or online in a running cluster allowing concurrent access. Data
+   checksums are enabled or disabled at the full cluster level, and cannot be
+   specified individually for databases or tables.
   </para>
 
   <para>
@@ -265,7 +266,7 @@
   </para>
 
   <sect2 id="checksums-offline-enable-disable">
-   <title>Off-line Enabling of Checksums</title>
+   <title>Offline Enabling of Checksums</title>
 
    <para>
     The <link linkend="app-pgchecksums"><application>pg_checksums</application></link>
@@ -274,6 +275,118 @@
    </para>
 
   </sect2>
+
+  <sect2 id="checksums-online-enable-disable">
+   <title>Online Enabling of Checksums</title>
+
+   <para>
+    Checksums can be enabled or disabled online, by calling the appropriate
+    <link linkend="functions-admin-checksum">functions</link>.
+   </para>
+
+   <para>
+    Both enabling and disabling data checksums happens in two phases, separated
+    by a checkpoint to ensure durability.  The different states, and their
+    transitions, are illustrated in <xref linkend="data-checksums-states-figure"/>
+    and discussed in further detail in this section.
+   </para>
+
+   <para>
+    <figure id="data-checksums-states-figure">
+     <title>data checksums states</title>
+     <mediaobject>
+      <imageobject>
+       <imagedata fileref="images/datachecksums.svg" format="SVG" width="100%"/>
+      </imageobject>
+     </mediaobject>
+    </figure>
+   </para>
+
+   <para>
+    Enabling checksums will set the cluster checksum mode to
+    <literal>inprogress-on</literal>.  During this time, checksums will be
+    written but not verified. In addition to this, a background worker process
+    is started that enables checksums on all existing data in the cluster. Once
+    this worker has completed processing all databases in the cluster, the
+    checksum mode will automatically switch to <literal>on</literal>. The
+    processing will consume two background worker processes; make sure that
+    <varname>max_worker_processes</varname> allows for at least two
+    additional processes.
+   </para>
+
+   <para>
+    The process will initially wait for all open transactions to finish before
+    it starts, so that it can be certain that there are no tables that have been
+    created inside a transaction that has not committed yet and thus would not
+    be visible to the process enabling checksums. It will also, for each database,
+    wait for all pre-existing temporary tables to get removed before it finishes.
+    If long-lived temporary tables are used in the application it may be necessary
+    to terminate these application connections to allow the process to complete.
+   </para>
+
+   <para>
+    If the cluster is stopped while in <literal>inprogress-on</literal> mode,
+    for any reason, then the checksum enable process must be restarted manually.
+    To do this, re-execute the function <function>pg_enable_data_checksums()</function>
+    once the cluster has been restarted. The process will start over, there is
+    no support for resuming work from where it was interrupted.  If the cluster
+    is stopped while in <literal>inprogress-off</literal>, then the checksum
+    state will be set to <literal>off</literal> when the cluster is
+    restarted.
+   </para>
+
+   <para>
+    Disabling data checksums will set the data checksum mode to
+    <literal>inprogress-off</literal>.  During this time, checksums will be
+    written but not verified.  After all processes acknowledge the change,
+    the mode will automatically be set to <literal>off</literal>.
+   </para>
+
+   <sect3 id="checksums-online-system-impact">
+    <title>Impact of Online Operations on the System</title>
+    <para>
+     Enabling data checksums can cause significant I/O to the system, as all of the
+     database pages will need to be rewritten, and will be written both to the
+     data files and the WAL. The impact may be limited by throttling using the
+     <parameter>cost_delay</parameter> and <parameter>cost_limit</parameter>
+     parameters of the <function>pg_enable_data_checksums</function> function.
+    </para>
+
+    <para>
+     <itemizedlist>
+      <listitem><para>
+       I/O: all pages need to have data checksums calculated and written which
+       will generate a lot of dirty pages that will need to be flushed to disk,
+       as well as WAL logged.
+      </para></listitem>
+      <listitem><para>
+       Replication: When the standby receives the data checksum state change
+       in the WAL stream it will issue a <glossterm linkend="glossary-restartpoint">
+       restartpoint</glossterm> in order to flush the current state into the
+       <filename>pg_control</filename> file.  The restartpoint will flush the
+       current state to disk and will block redo until finished.  This in turn
+       will induce replication lag, which on synchronous standbys also blocks
+       the primary.  Reducing <xref linkend="guc-max-wal-size"/> before the
+       process is started can help with reducing the time it takes for the
+       restartpoint to finish.
+      </para></listitem>
+      <listitem><para>
+       Shutdown/Restart: If the server is shut down or restarted when data
+       checksums are being enabled, the process will not resume and all pages
+       need to be recalculated and rewritten.  Enabling data checksums should
+       be done when there is no need for regular maintenance or during a
+       service window.
+      </para></listitem>
+     </itemizedlist>
+    </para>
+
+    <para>
+     Rewriting all pages is not needed when disabling data checksums, but
+     checkpoints are still required.
+    </para>
+   </sect3>
+
+  </sect2>
  </sect1>
 
   <sect1 id="wal-intro">
diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index cd6c2a2f650..c50d654db30 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -18,6 +18,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
+#include "storage/bufpage.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -167,6 +168,26 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		memcpy(&wal_level, rec, sizeof(int));
 		appendStringInfo(buf, "wal_level %s", get_wal_level_string(wal_level));
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state xlrec;
+
+		memcpy(&xlrec, rec, sizeof(xl_checksum_state));
+		switch (xlrec.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_VERSION:
+				appendStringInfoString(buf, "on");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				appendStringInfoString(buf, "inprogress-off");
+				break;
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				appendStringInfoString(buf, "inprogress-on");
+				break;
+			default:
+				appendStringInfoString(buf, "off");
+		}
+	}
 }
 
 const char *
@@ -218,6 +239,9 @@ xlog_identify(uint8 info)
 		case XLOG_CHECKPOINT_REDO:
 			id = "CHECKPOINT_REDO";
 			break;
+		case XLOG_CHECKSUMS:
+			id = "CHECKSUMS";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 22d0a2e8c3a..792eadf9387 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -287,6 +287,11 @@ static XLogRecPtr RedoRecPtr;
  */
 static bool doPageWrites;
 
+/*
+ * Force creating a restartpoint on the next CHECKPOINT after XLOG_CHECKSUMS.
+ */
+static bool checksumRestartPoint = false;
+
 /*----------
  * Shared-memory data structures for XLOG control
  *
@@ -551,6 +556,9 @@ typedef struct XLogCtlData
 	 */
 	XLogRecPtr	lastFpwDisableRecPtr;
 
+	/* last data_checksum_version we've seen */
+	uint32		data_checksum_version;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogCtlData;
 
@@ -574,6 +582,43 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
  */
 static ControlFileData *ControlFile = NULL;
 
+/*
+ * This must match largest number of sets in barrier_eq and barrier_ne in the
+ * below checksum_barriers definition.
+ */
+#define MAX_BARRIER_CONDITIONS 2
+
+/*
+ * Configuration of conditions which must match when absorbing a procsignal
+ * barrier during data checksum enable/disable operations.  A single function
+ * is used for absorbing all barriers, and the set of conditions to use is
+ * looked up in the checksum_barriers struct.  The struct member for the target
+ * state defines which state the backend must currently be in, and which it
+ * must not be in.
+ */
+typedef struct ChecksumBarrierCondition
+{
+	/* The target state of the barrier */
+	int			target;
+	/* A set of states in which at least one MUST match the current state */
+	int			barrier_eq[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_eq set */
+	int			barrier_eq_sz;
+	/* A set of states which all MUST NOT match the current state */
+	int			barrier_ne[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_ne set */
+	int			barrier_ne_sz;
+} ChecksumBarrierCondition;
+
+static const ChecksumBarrierCondition checksum_barriers[] =
+{
+	{PG_DATA_CHECKSUM_OFF, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION}, 2, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_VERSION, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION}, 1, {0}, 0},
+	{PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, {PG_DATA_CHECKSUM_ANY_VERSION}, 1, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, {PG_DATA_CHECKSUM_VERSION}, 1, {0}, 0},
+	{-1}
+};
+
 /*
  * Calculate the amount of space left on the page after 'endptr'. Beware
  * multiple evaluation!
@@ -648,6 +693,34 @@ static XLogRecPtr LocalMinRecoveryPoint;
 static TimeLineID LocalMinRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
+/*
+ * Local state for Controlfile data_checksum_version.  After initialization
+ * this is only updated when absorbing a procsignal barrier during interrupt
+ * processing.  The reason for keeping a copy in backend-private memory is to
+ * avoid locking for interrogating the data checksum state.  Possible values
+ * are the data checksum versions defined in storage/bufpage.h as well as zero
+ * when data checksums are disabled.
+ */
+static uint32 LocalDataChecksumVersion = 0;
+
+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value when the process is spawned between the update
+ * of XLogCtl->data_checksum_version and the barrier being emitted.  This can
+ * only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;
+
+/*
+ * Variable backing the GUC, keep it in sync with LocalDataChecksumVersion.
+ * See SetLocalDataChecksumVersion().
+ */
+int			data_checksums = 0;
+
+static void SetLocalDataChecksumVersion(uint32 data_checksum_version);
+
 /* For WALInsertLockAcquire/Release functions */
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
@@ -716,6 +789,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void XLogChecksums(uint32 new_type);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -830,9 +905,10 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * only happen just after a checkpoint, so it's better to be slow in
 		 * this case and fast otherwise.
 		 *
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
 		 *
 		 * If we aren't doing full-page writes then RedoRecPtr doesn't
 		 * actually affect the contents of the XLOG record, so we'll update
@@ -845,7 +921,9 @@ XLogInsertRecord(XLogRecData *rdata,
 			Assert(RedoRecPtr < Insert->RedoRecPtr);
 			RedoRecPtr = Insert->RedoRecPtr;
 		}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());
 
 		if (doPageWrites &&
 			(!prevDoPageWrites ||
@@ -4252,6 +4330,12 @@ InitControlFile(uint64 sysidentifier, uint32 data_checksum_version)
 	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->track_commit_timestamp = track_commit_timestamp;
 	ControlFile->data_checksum_version = data_checksum_version;
+
+	/*
+	 * Set the data_checksum_version value into XLogCtl, which is where all
+	 * processes get the current value from.
+	 */
+	XLogCtl->data_checksum_version = data_checksum_version;
 }
 
 static void
@@ -4587,9 +4671,9 @@ ReadControlFile(void)
 
 	CalculateCheckpointSegments();
 
-	/* Make the initdb settings visible as GUC variables, too */
-	SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no",
-					PGC_INTERNAL, PGC_S_DYNAMIC_DEFAULT);
+	elog(LOG, "ReadControlFile checkpoint %X/%08X redo %X/%08X",
+		 LSN_FORMAT_ARGS(ControlFile->checkPoint),
+		 LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo));
 }
 
 /*
@@ -4623,13 +4707,430 @@ GetMockAuthenticationNonce(void)
 }
 
 /*
- * Are checksums enabled for data pages?
+ * DataChecksumsNeedWrite
+ *		Returns whether data checksums must be written or not
+ *
+ * Returns true iff data checksums are enabled, or in the process of being
+ * enabled or disabled.  During the "inprogress-on" and "inprogress-off"
+ * states checksums must be written even though they are not verified (see
+ * datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which are about to write a data page
+ * to storage, and need to know whether to re-calculate the checksum for the
+ * page header. Calling this function must be performed as close to the write
+ * operation as possible to keep the critical section short.
  */
 bool
-DataChecksumsEnabled(void)
+DataChecksumsNeedWrite(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION ||
+			LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * DataChecksumsNeedVerify
+ *		Returns whether data checksums must be verified or not
+ *
+ * Data checksums are only verified if they are fully enabled in the cluster.
+ * During the "inprogress-on" and "inprogress-off" states they are only
+ * updated, not verified (see datachecksumsworker.c for a longer discussion).
+ *
+ * This function is intended for callsites which have read data and are about
+ * to perform checksum validation based on the result of this.  Calling this
+ * function must be performed as close to the validation call as possible to
+ * keep the critical section short. This is in order to protect against time of
+ * check/time of use situations around data checksum validation.
+ */
+bool
+DataChecksumsNeedVerify(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION);
+}
+
+/*
+ * DataChecksumsOnInProgress
+ *		Returns whether data checksums are being enabled
+ *
+ * Most operations don't need to worry about the "inprogress" states, and
+ * should use DataChecksumsNeedVerify() or DataChecksumsNeedWrite(). The
+ * "inprogress-on" state for enabling checksums is used when the checksum
+ * worker is setting checksums on all pages; it can thus be used to check for
+ * aborted checksum processing which needs to be restarted.
+ */
+inline bool
+DataChecksumsOnInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+}
+
+/*
+ * DataChecksumsOffInProgress
+ *		Returns whether data checksums are being disabled
+ *
+ * The "inprogress-off" state for disabling checksums is used for when the
+ * worker resets the catalog state.  DataChecksumsNeedVerify() or
+ * DataChecksumsNeedWrite() should be used for deciding whether to read/write
+ * checksums.
+ */
+bool
+DataChecksumsOffInProgress(void)
+{
+	return (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+}
+
+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(bool immediate_checkpoint)
+{
+	uint64		barrier;
+	int			flags;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	/*
+	 * Force a checkpoint to persist the current checksum state in the control
+	 * file etc.
+	 *
+	 * XXX is this needed? There's already a checkpoint at the end of
+	 * ProcessAllDatabases, maybe this is redundant?
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initialized with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(bool immediate_checkpoint)
 {
+	uint64		barrier;
+	int			flags;
+
 	Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");
+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();
+
+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await the state transition to "on" in all backends. When done we know
+	 * that data checksums are enabled in all backends and are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	INJECTION_POINT("datachecksums-enable-checksums-pre-checkpoint", NULL);
+
+	/* XXX is this needed? */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(bool immediate_checkpoint)
+{
+	uint64		barrier;
+	int			flags;
+
+	Assert(ControlFile != NULL);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/* If data checksums are already disabled there is nothing to do */
+	if (XLogCtl->data_checksum_version == 0)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		return;
+	}
+
+	/*
+	 * If data checksums are currently enabled we first transition to the
+	 * "inprogress-off" state during which backends continue to write
+	 * checksums without verifying them. When all backends are in
+	 * "inprogress-off" the next transition to "off" can be performed, after
+	 * which all data checksum processing is disabled.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+		START_CRIT_SECTION();
+
+		XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+
+		END_CRIT_SECTION();
+		MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+		/*
+		 * Wait for all backends to absorb the barrier, ensuring that any
+		 * backend in the "on" state has changed to "inprogress-off".
+		 */
+		WaitForProcSignalBarrier(barrier);
+
+		/*
+		 * Force a checkpoint to persist the current checksum state in the
+		 * control file etc.
+		 *
+		 * XXX is this safe? What if the crash/shutdown happens while waiting
+		 * for the checkpoint? Also, should we persist the checksum first and
+		 * only then flip the flag in XLogCtl?
+		 */
+		INJECTION_POINT("datachecksums-disable-checksums-pre-checkpoint", NULL);
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+
+		/*
+		 * At this point we know that no backends are verifying data checksums
+		 * during reading. Next, we can safely move to state "off" to also
+		 * stop writing checksums.
+		 */
+	}
+	else
+	{
+		/*
+		 * Ending up here implies that the checksums state is "inprogress-on"
+		 * or "inprogress-off" and we can transition directly to "off" from
+		 * there.
+		 */
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
+	/*
+	 * Ensure that we don't incur a checkpoint while disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * AbsorbDataChecksumsBarrier
+ *		Generic function for absorbing data checksum state changes
+ *
+ * All procsignalbarriers regarding data checksum state changes are absorbed
+ * with this function.  The set of conditions required for the state change to
+ * be accepted is listed in the checksum_barriers struct; target_state is
+ * used to look up the relevant entry.
+ */
+bool
+AbsorbDataChecksumsBarrier(int target_state)
+{
+	const ChecksumBarrierCondition *condition = checksum_barriers;
+	int			current = LocalDataChecksumVersion;
+	bool		found = false;
+
+	/*
+	 * Find the barrier condition definition for the target state. Not finding
+	 * a condition would be a grave programmer error as the states are a
+	 * discrete set.
+	 */
+	while (condition->target != target_state && condition->target != -1)
+		condition++;
+	if (unlikely(condition->target == -1))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid target state %i for data checksum barrier",
+					   target_state));
+
+	/*
+	 * The current state MUST be equal to one of the EQ states defined in this
+	 * barrier condition, or equal to the target_state if - and only if -
+	 * InitialDataChecksumTransition is true.
+	 */
+	for (int i = 0; i < condition->barrier_eq_sz; i++)
+	{
+		if (current == condition->barrier_eq[i] ||
+			condition->barrier_eq[i] == PG_DATA_CHECKSUM_ANY_VERSION)
+			found = true;
+	}
+	if (InitialDataChecksumTransition && current == target_state)
+		found = true;
+
+	/*
+	 * The current state MUST NOT be equal to any of the NE states defined in
+	 * this barrier condition.
+	 */
+	for (int i = 0; i < condition->barrier_ne_sz; i++)
+	{
+		if (current == condition->barrier_ne[i])
+			found = false;
+	}
+
+	/*
+	 * If the relevant state criteria aren't satisfied, throw an error which
+	 * will be caught by the procsignal machinery for a later retry.
+	 */
+	if (!found)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("incorrect data checksum state %i for target state %i",
+					   current, target_state));
+
+	SetLocalDataChecksumVersion(target_state);
+	InitialDataChecksumTransition = false;
+	return true;
+}
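+
+/*
+ * For illustration only: the checksum_barriers table (defined elsewhere in
+ * this patch, exact field layout not reproduced here) couples each target
+ * state with the states it may be entered from.  An entry allowing the
+ * transition to "on" only from "inprogress-on", matching the check in
+ * SetDataChecksumsOn(), could look roughly like:
+ *
+ *     { .target = PG_DATA_CHECKSUM_VERSION,
+ *       .barrier_eq = { PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION }, ... }
+ *
+ * The table is terminated by an entry with target == -1, which the lookup
+ * loop above relies on.
+ */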
+
+/*
+ * InitLocalDataChecksumVersion
+ *
+ * Set up backend-local caches of controldata variables which may change at
+ * any point during runtime and thus require special-cased locking. So far
+ * this only applies to data_checksum_version, but it's intended to be general
+ * purpose enough to handle future cases.
+ */
+void
+InitLocalDataChecksumVersion(void)
+{
+	SpinLockAcquire(&XLogCtl->info_lck);
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+void
+SetLocalDataChecksumVersion(uint32 data_checksum_version)
+{
+	LocalDataChecksumVersion = data_checksum_version;
+
+	data_checksums = data_checksum_version;
+}
+
+/* Show hook for the data_checksums GUC */
+const char *
+show_data_checksums(void)
+{
+	if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_VERSION)
+		return "on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+		return "inprogress-on";
+	else if (LocalDataChecksumVersion == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+		return "inprogress-off";
+	else
+		return "off";
 }
 
 /*
@@ -4904,6 +5405,7 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+	SetLocalDataChecksumVersion(ControlFile->data_checksum_version);
 }
 
 /*
@@ -5073,6 +5575,11 @@ XLOGShmemInit(void)
 	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->WalWriterSleeping = false;
 
+	/* Use the checksum info from control file */
+	XLogCtl->data_checksum_version = ControlFile->data_checksum_version;
+
+	SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+
 	SpinLockInit(&XLogCtl->Insert.insertpos_lck);
 	SpinLockInit(&XLogCtl->info_lck);
 	pg_atomic_init_u64(&XLogCtl->logInsertResult, InvalidXLogRecPtr);
@@ -6214,6 +6721,47 @@ StartupXLOG(void)
 	pfree(endOfRecoveryInfo->recoveryStopReason);
 	pfree(endOfRecoveryInfo);
 
+	/*
+	 * If we reach this point with checksums in the state inprogress-on, it
+	 * means that data checksums were in the process of being enabled when the
+	 * cluster shut down. Since processing didn't finish, the operation will
+	 * have to be restarted from scratch, as there is no capability to resume
+	 * from where processing was when the cluster shut down. Thus, revert the
+	 * state back to off and inform the user with a warning message. Being
+	 * able to resume processing is a TODO, but it wouldn't be possible to
+	 * restart it here anyway since we cannot launch a dynamic background
+	 * worker directly from here (it has to be done from a regular backend).
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		ereport(WARNING,
+				(errmsg("data checksum state has been set to off"),
+				 errhint("If checksums were being enabled during shutdown then processing must be manually restarted.")));
+	}
+
+	/*
+	 * If data checksums were being disabled when the cluster was shut down,
+	 * we know that we have a state where all backends have stopped validating
+	 * checksums and we can move to off instead of prompting the user to
+	 * perform any action.
+	 */
+	if (XLogCtl->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION)
+	{
+		XLogChecksums(0);
+
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = 0;
+		SetLocalDataChecksumVersion(XLogCtl->data_checksum_version);
+		SpinLockRelease(&XLogCtl->info_lck);
+	}
+
 	/*
 	 * All done with end-of-recovery actions.
 	 *
@@ -6511,7 +7059,7 @@ GetRedoRecPtr(void)
 	XLogRecPtr	ptr;
 
 	/*
-	 * The possibly not up-to-date copy in XlogCtl is enough. Even if we
+	 * The possibly not up-to-date copy in XLogCtl is enough. Even if we
 	 * grabbed a WAL insertion lock to read the authoritative value in
 	 * Insert->RedoRecPtr, someone might update it just after we've released
 	 * the lock.
@@ -7075,6 +7623,12 @@ CreateCheckPoint(int flags)
 	checkPoint.fullPageWrites = Insert->fullPageWrites;
 	checkPoint.wal_level = wal_level;
 
+	/*
+	 * Get the current data_checksum_version value from XLogCtl, valid at the
+	 * time of the checkpoint.
+	 */
+	checkPoint.data_checksum_version = XLogCtl->data_checksum_version;
+
 	if (shutdown)
 	{
 		XLogRecPtr	curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
@@ -7330,6 +7884,9 @@ CreateCheckPoint(int flags)
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
 
+	/* make sure we start with the checksum version as of the checkpoint */
+	ControlFile->data_checksum_version = checkPoint.data_checksum_version;
+
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
 	 * unused on non-shutdown checkpoints, but seems useful to store it always
@@ -7473,6 +8030,10 @@ CreateEndOfRecoveryRecord(void)
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 	ControlFile->minRecoveryPoint = recptr;
 	ControlFile->minRecoveryPointTLI = xlrec.ThisTimeLineID;
+
+	/* start with the latest checksum version (as of the end of recovery) */
+	ControlFile->data_checksum_version = XLogCtl->data_checksum_version;
+
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
@@ -7814,6 +8375,10 @@ CreateRestartPoint(int flags)
 			if (flags & CHECKPOINT_IS_SHUTDOWN)
 				ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;
 		}
+
+		/* we shall start with the latest checksum version */
+		ControlFile->data_checksum_version = lastCheckPoint.data_checksum_version;
+
 		UpdateControlFile();
 	}
 	LWLockRelease(ControlFileLock);
@@ -8225,6 +8790,26 @@ XLogReportParameters(void)
 	}
 }
 
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	INJECTION_POINT("datachecksums-xlogchecksums-pre-xloginsert", &new_type);
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}
+
 /*
  * Update full_page_writes in shared memory, and write an
  * XLOG_FPW_CHANGE record if necessary.
@@ -8659,6 +9244,74 @@ xlog_redo(XLogReaderState *record)
 	{
 		/* nothing to do here, just for informational purposes */
 	}
+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		/*
+		 * XXX Could this end up written to the control file prematurely? IIRC
+		 * that happens during checkpoint, so what if that gets triggered e.g.
+		 * because someone runs CHECKPOINT? If we then crash (or something
+		 * like that), could that confuse the instance?
+		 */
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+		}
+
+		/*
+		 * Force creation of a restartpoint for the first CHECKPOINT record
+		 * seen after XLOG_CHECKSUMS in the WAL.
+		 */
+		checksumRestartPoint = true;
+	}
+
+	if (checksumRestartPoint &&
+		(info == XLOG_CHECKPOINT_ONLINE ||
+		 info == XLOG_CHECKPOINT_REDO ||
+		 info == XLOG_CHECKPOINT_SHUTDOWN))
+	{
+		int			flags;
+
+		elog(LOG, "forcing creation of a restartpoint after XLOG_CHECKSUMS");
+
+		/* We explicitly want an immediate checkpoint here */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+
+		checksumRestartPoint = false;
+	}
 }
 
 /*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 3e45fce43ed..914cb3caf79 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/datachecksumsworker.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +749,59 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * Disables data checksums for the cluster, if applicable. Starts a background
+ * worker which turns off the data checksums.
+ */
+Datum
+disable_data_checksums(PG_FUNCTION_ARGS)
+{
+	bool		fast = PG_GETARG_BOOL(0);
+
+	ereport(LOG,
+			errmsg("disable_data_checksums, fast: %d", fast));
+
+	if (!superuser())
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				errmsg("must be superuser to change data checksum state"));
+
+	StartDataChecksumsWorkerLauncher(DISABLE_DATACHECKSUMS, 0, 0, fast);
+	PG_RETURN_VOID();
+}
+
+/*
+ * Enables data checksums for the cluster, if applicable.  Supports vacuum-like
+ * cost-based throttling to limit system load. Starts a background worker
+ * which updates data checksums on existing data.
+ */
+Datum
+enable_data_checksums(PG_FUNCTION_ARGS)
+{
+	int			cost_delay = PG_GETARG_INT32(0);
+	int			cost_limit = PG_GETARG_INT32(1);
+	bool		fast = PG_GETARG_BOOL(2);
+
+	ereport(LOG,
+			errmsg("enable_data_checksums, cost_delay: %d cost_limit: %d fast: %d", cost_delay, cost_limit, fast));
+
+	if (!superuser())
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				errmsg("must be superuser to change data checksum state"));
+
+	if (cost_delay < 0)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("cost delay cannot be a negative value"));
+
+	if (cost_limit <= 0)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("cost limit must be greater than zero"));
+
+	StartDataChecksumsWorkerLauncher(ENABLE_DATACHECKSUMS, cost_delay, cost_limit, fast);
+
+	PG_RETURN_VOID();
+}
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 21b8f179ba0..7291f618864 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -783,6 +783,10 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		CheckPointTLI = ControlFile->checkPointCopy.ThisTimeLineID;
 		RedoStartLSN = ControlFile->checkPointCopy.redo;
 		RedoStartTLI = ControlFile->checkPointCopy.ThisTimeLineID;
+
+		elog(LOG, "InitWalRecovery checkpoint %X/%08X redo %X/%08X",
+			 LSN_FORMAT_ARGS(CheckPointLoc), LSN_FORMAT_ARGS(RedoStartLSN));
+
 		record = ReadCheckpointRecord(xlogprefetcher, CheckPointLoc,
 									  CheckPointTLI);
 		if (record != NULL)
@@ -1666,6 +1670,9 @@ PerformWalRecovery(void)
 	bool		reachedRecoveryTarget = false;
 	TimeLineID	replayTLI;
 
+	elog(LOG, "PerformWalRecovery checkpoint %X/%08X redo %X/%08X",
+		 LSN_FORMAT_ARGS(CheckPointLoc), LSN_FORMAT_ARGS(RedoStartLSN));
+
 	/*
 	 * Initialize shared variables for tracking progress of WAL replay, as if
 	 * we had just replayed the record before the REDO location (or the
@@ -1674,12 +1681,14 @@ PerformWalRecovery(void)
 	SpinLockAcquire(&XLogRecoveryCtl->info_lck);
 	if (RedoStartLSN < CheckPointLoc)
 	{
+		elog(LOG, "(RedoStartLSN < CheckPointLoc)");
 		XLogRecoveryCtl->lastReplayedReadRecPtr = InvalidXLogRecPtr;
 		XLogRecoveryCtl->lastReplayedEndRecPtr = RedoStartLSN;
 		XLogRecoveryCtl->lastReplayedTLI = RedoStartTLI;
 	}
 	else
 	{
+		elog(LOG, "(RedoStartLSN >= CheckPointLoc)");
 		XLogRecoveryCtl->lastReplayedReadRecPtr = xlogreader->ReadRecPtr;
 		XLogRecoveryCtl->lastReplayedEndRecPtr = xlogreader->EndRecPtr;
 		XLogRecoveryCtl->lastReplayedTLI = CheckPointTLI;
@@ -1691,6 +1700,10 @@ PerformWalRecovery(void)
 	XLogRecoveryCtl->recoveryPauseState = RECOVERY_NOT_PAUSED;
 	SpinLockRelease(&XLogRecoveryCtl->info_lck);
 
+	elog(LOG, "PerformWalRecovery lastReplayedReadRecPtr %X/%08X lastReplayedEndRecPtr %X/%08X",
+		 LSN_FORMAT_ARGS(XLogRecoveryCtl->lastReplayedReadRecPtr),
+		 LSN_FORMAT_ARGS(XLogRecoveryCtl->lastReplayedEndRecPtr));
+
 	/* Also ensure XLogReceiptTime has a sane value */
 	XLogReceiptTime = GetCurrentTimestamp();
 
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 2be4e069816..baf6c8cc2cc 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	 * enabled for this cluster, and if this is a relation file, then verify
 	 * the checksum.
 	 */
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
 		RelFileNumberIsValid(relfilenumber))
 		verify_checksum = true;
 
@@ -2007,6 +2008,9 @@ verify_page_checksum(Page page, XLogRecPtr start_lsn, BlockNumber blkno,
 	if (PageIsNew(page) || PageGetLSN(page) >= start_lsn)
 		return true;
 
+	if (!DataChecksumsNeedVerify())
+		return true;
+
 	/* Perform the actual checksum calculation. */
 	checksum = pg_checksum_page(page, blkno);
 
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 2d946d6d9e9..0bded82b84c 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -657,6 +657,22 @@ LANGUAGE INTERNAL
 STRICT VOLATILE PARALLEL UNSAFE
 AS 'pg_replication_origin_session_setup';
 
+CREATE OR REPLACE FUNCTION
+  pg_enable_data_checksums(cost_delay integer DEFAULT 0,
+                           cost_limit integer DEFAULT 100,
+                           fast boolean DEFAULT false)
+RETURNS void
+STRICT VOLATILE LANGUAGE internal
+PARALLEL RESTRICTED
+AS 'enable_data_checksums';
+
+CREATE OR REPLACE FUNCTION
+  pg_disable_data_checksums(fast boolean DEFAULT false)
+RETURNS void
+STRICT VOLATILE LANGUAGE internal
+PARALLEL RESTRICTED
+AS 'disable_data_checksums';
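+
+-- Example usage (an illustrative sketch, not executed here): enable data
+-- checksums online with vacuum-style throttling, then inspect the state:
+--
+--   SELECT pg_enable_data_checksums(cost_delay => 10, cost_limit => 200);
+--   SHOW data_checksums;  -- reports "inprogress-on" until processing finishes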
+
 --
 -- The default permissions for functions mean that anyone can execute them.
 -- A number of functions shouldn't be executable by just anyone, but rather
@@ -782,6 +798,10 @@ REVOKE EXECUTE ON FUNCTION pg_ls_logicalmapdir() FROM PUBLIC;
 
 REVOKE EXECUTE ON FUNCTION pg_ls_replslotdir(text) FROM PUBLIC;
 
+REVOKE EXECUTE ON FUNCTION pg_enable_data_checksums(integer, integer, boolean) FROM PUBLIC;
+
+REVOKE EXECUTE ON FUNCTION pg_disable_data_checksums(boolean) FROM PUBLIC;
+
 --
 -- We also set up some things as accessible to standard roles.
 --
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 086c4c8fb6f..6d452b10bce 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1374,6 +1374,26 @@ CREATE VIEW pg_stat_progress_copy AS
     FROM pg_stat_get_progress_info('COPY') AS S
         LEFT JOIN pg_database D ON S.datid = D.oid;
 
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting on temporary tables'
+                      WHEN 3 THEN 'waiting on checkpoint'
+                      WHEN 4 THEN 'waiting on barrier'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
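+
+-- Example (illustrative): while pg_enable_data_checksums() is running, the
+-- progress of the launcher and of the per-database worker can be followed
+-- with a query such as:
+--
+--   SELECT datname, phase, relations_done, relations_total,
+--          blocks_done, blocks_total
+--     FROM pg_stat_progress_data_checksums;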
+
 CREATE VIEW pg_user_mappings AS
     SELECT
         U.oid       AS umid,
diff --git a/src/backend/postmaster/Makefile b/src/backend/postmaster/Makefile
index 0f4435d2d97..0c36765acfe 100644
--- a/src/backend/postmaster/Makefile
+++ b/src/backend/postmaster/Makefile
@@ -18,6 +18,7 @@ OBJS = \
 	bgworker.o \
 	bgwriter.o \
 	checkpointer.o \
+	datachecksumsworker.o \
 	fork_process.o \
 	interrupt.o \
 	launch_backend.o \
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index a6d3630398f..5742a1dd724 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <signal.h>
 
+#include "access/xlog.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/auxprocess.h"
@@ -68,6 +69,24 @@ AuxiliaryProcessMainCommon(void)
 
 	ProcSignalInit(NULL, 0);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current value
+	 * of LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Auxiliary processes don't run transactions, but they may need a
 	 * resource owner anyway to manage buffer pins acquired outside
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 142a02eb5e9..ed3dc05406c 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -18,6 +18,7 @@
 #include "pgstat.h"
 #include "port/atomics.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/postmaster.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalworker.h"
@@ -135,6 +136,12 @@ static const struct
 	},
 	{
 		"SequenceSyncWorkerMain", SequenceSyncWorkerMain
+	},
+	{
+		"DataChecksumsWorkerLauncherMain", DataChecksumsWorkerLauncherMain
+	},
+	{
+		"DataChecksumsWorkerMain", DataChecksumsWorkerMain
 	}
 };
 
diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..57311760b2b
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1491 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When data checksums are enabled at initdb time, or on a shut-down cluster
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on.  In the case of disabling
+ * checksums, the state transition is performed only in the control file; no
+ * changes are made to the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksums. The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.
+ *
+ * For each database, all relations which have storage are read and every data
+ * page is marked dirty to force a write with the checksum. This will generate
+ * a lot of WAL as the entire database is read and written.
+ *
+ * If the processing is interrupted by a cluster restart, it will be restarted
+ * from the beginning again as state isn't persisted.
+ *
+ * Disabling checksums
+ * -------------------
+ * When disabling checksums, data_checksums will be set to "inprogress-off"
+ * which signals that checksums are written but no longer verified. This
+ * ensures that backends which have yet to move from the "on" state can still
+ * validate data checksums safely.
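+ *
+ * In summary, the cluster-wide data_checksums state transitions are:
+ *
+ *    enabling:   "off" -> "inprogress-on" -> "on"
+ *    disabling:  "on"  -> "inprogress-off" -> "off"
+ *
+ * A restart while in "inprogress-on" reverts the state to "off" and
+ * processing must be restarted manually, whereas a restart while in
+ * "inprogress-off" simply completes the transition to "off".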
+ *
+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate the data_checksums state they have agreed to
+ *      by acknowledging the procsignalbarrier:  This means that all backends
+ *      MUST calculate and write data checksums during all states except off;
+ *      MUST validate checksums only in the 'on' state.
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have state "on": This means that all
+ *      backends must wait on the procsignalbarrier to be acknowledged by all
+ *      before proceeding to validate data checksums.
+ *
+ * There are two levels of synchronization required for changing data_checksums
+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL-logging backend updating the global state in the control file will wait
+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------
+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be placed in Bd. Backends transition Bd -> Bi via a procsignalbarrier
+ *   which is emitted by the DataChecksumsLauncher.  When all backends have
+ *   acknowledged the barrier then Bd will be empty and the next phase can
+ *   begin: calculating and writing data checksums with DataChecksumsWorkers.
+ *   When the DataChecksumsWorker processes have finished writing checksums on
+ *   all pages and enables data checksums cluster-wide via another
+ *   procsignalbarrier, there are four sets of backends where Bd shall be an
+ *   empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bo: Backends in "inprogress-off" state
+ *
+ *   Backends transition from the Be state to Bd like so: Be -> Bo -> Bd
+ *
+ *   The goal is to transition all backends to Bd making the others empty sets.
+ *   Backends in Bo write data checksums, but don't validate them, such that
+ *   backends still in Be can continue to validate pages until the barrier has
+ *   been absorbed such that they are in Bo. Once all backends are in Bo, the
+ *   barrier to transition to "off" can be raised and all backends can safely
+ *   stop writing data checksums as no backend is enforcing data checksum
+ *   validation any longer.
+ *
+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since a dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: even if the
+ *     checksum on the page already matches we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.
+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *   * Restartability (not necessarily with page granularity).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/heapam.h"
+#include "access/htup_details.h"
+#include "access/xact.h"
+#include "access/xloginsert.h"
+#include "catalog/indexing.h"
+#include "catalog/pg_class.h"
+#include "catalog/pg_database.h"
+#include "commands/progress.h"
+#include "commands/vacuum.h"
+#include "common/relpath.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "postmaster/bgworker.h"
+#include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/bufmgr.h"
+#include "storage/checksum.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/smgr.h"
+#include "tcop/tcopprot.h"
+#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
+#include "utils/lsyscache.h"
+#include "utils/ps_status.h"
+#include "utils/syscache.h"
+
+/*
+ * Number of times we retry opening a database before giving up and
+ * considering it to have failed processing.
+ */
+#define DATACHECKSUMSWORKER_MAX_DB_RETRIES 5
+
+/*
+ * Signaling between backends calling pg_enable/disable_data_checksums, the
+ * checksums launcher process, and the checksums worker process.
+ *
+ * This struct is protected by DataChecksumsWorkerLock
+ */
+typedef struct DataChecksumsWorkerShmemStruct
+{
+	/*
+	 * These are set by pg_{enable|disable|verify}_data_checksums, to tell the
+	 * launcher what the target state is.
+	 */
+	DataChecksumsWorkerOperation launch_operation;
+	int			launch_cost_delay;
+	int			launch_cost_limit;
+	bool		launch_fast;
+
+	/*
+	 * Is a launcher process currently running?
+	 *
+	 * This is set by the launcher process, after it has read the above
+	 * launch_* parameters.
+	 */
+	bool		launcher_running;
+
+	/*
+	 * These fields indicate the target state that the launcher is currently
+	 * working towards. They can be different from the corresponding launch_*
+	 * fields, if a new pg_enable/disable_data_checksums() call was made while
+	 * the launcher/worker was already running.
+	 *
+	 * The below members are set when the launcher starts, and are only
+	 * accessed read-only by the single worker. Thus, we can access these
+	 * without a lock. If multiple workers, or dynamic cost parameters, are
+	 * supported at some point then this would need to be revisited.
+	 */
+	DataChecksumsWorkerOperation operation;
+	int			cost_delay;
+	int			cost_limit;
+	bool		immediate_checkpoint;
+
+	/*
+	 * Signaling between the launcher and the worker process.
+	 *
+	 * As there is only a single worker, and the launcher won't read these
+	 * until the worker exits, they can be accessed without the need for a
+	 * lock. If multiple workers are supported then this will have to be
+	 * revisited.
+	 */
+
+	/* result, set by worker before exiting */
+	DataChecksumsWorkerResult success;
+
+	/*
+	 * tells the worker process whether it should also process the shared
+	 * catalogs
+	 */
+	bool		process_shared_catalogs;
+} DataChecksumsWorkerShmemStruct;
+
+/* Shared memory segment for datachecksumsworker */
+static DataChecksumsWorkerShmemStruct *DataChecksumsWorkerShmem;
+
+typedef struct DataChecksumsWorkerDatabase
+{
+	Oid			dboid;
+	char	   *dbname;
+} DataChecksumsWorkerDatabase;
+
+typedef struct DataChecksumsWorkerResultEntry
+{
+	Oid			dboid;
+	DataChecksumsWorkerResult result;
+	int			retries;
+} DataChecksumsWorkerResultEntry;
+
+
+/*
+ * Flag set by the interrupt handler
+ */
+static volatile sig_atomic_t abort_requested = false;
+
+/*
+ * Have we set the DataChecksumsWorkerShmemStruct->launcher_running flag?
+ * If we have, we need to clear it before exiting!
+ */
+static volatile sig_atomic_t launcher_running = false;
+
+/*
+ * Are we enabling data checksums, or disabling them?
+ */
+static DataChecksumsWorkerOperation operation;
+
+/* Prototypes */
+static List *BuildDatabaseList(void);
+static List *BuildRelationList(bool temp_relations, bool include_shared);
+static void FreeDatabaseList(List *dblist);
+static DataChecksumsWorkerResult ProcessDatabase(DataChecksumsWorkerDatabase *db);
+static bool ProcessAllDatabases(bool immediate_checkpoint);
+static bool ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy);
+static void launcher_cancel_handler(SIGNAL_ARGS);
+static void WaitForAllTransactionsToFinish(void);
+
+/*
+ * StartDataChecksumsWorkerLauncher
+ *		Launches the datachecksumsworker launcher process
+ *
+ * The main entry point for starting data checksum processing, for enabling
+ * as well as disabling checksums.
+ */
+void
+StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+								 int cost_delay,
+								 int cost_limit,
+								 bool fast)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	bool		launcher_running;
+
+#ifdef USE_ASSERT_CHECKING
+	/* The cost delay settings have no effect when disabling */
+	if (op == DISABLE_DATACHECKSUMS)
+		Assert(cost_delay == 0 && cost_limit == 0);
+#endif
+
+	/* Store the desired state in shared memory */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	DataChecksumsWorkerShmem->launch_operation = op;
+	DataChecksumsWorkerShmem->launch_cost_delay = cost_delay;
+	DataChecksumsWorkerShmem->launch_cost_limit = cost_limit;
+	DataChecksumsWorkerShmem->launch_fast = fast;
+
+	/* is the launcher already running? */
+	launcher_running = DataChecksumsWorkerShmem->launcher_running;
+
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * Launch a new launcher process, if it's not running already.
+	 *
+	 * If the launcher is currently busy enabling the checksums, and we want
+	 * them disabled (or vice versa), the launcher will notice that at the
+	 * latest when it's about to exit, and will loop back to process the new
+	 * request. So if the launcher is already running, we don't need to do
+	 * anything more here to abort it.
+	 *
+	 * If you call pg_enable/disable_data_checksums() twice in a row, before
+	 * the launcher has had a chance to start up, we still end up launching it
+	 * twice.  That's OK, the second invocation will see that a launcher is
+	 * already running and exit quickly.
+	 *
+	 * TODO: We could optimize here and skip launching the launcher, if we are
+	 * already in the desired state, i.e. if the checksums are already enabled
+	 * and you call pg_enable_data_checksums().
+	 */
+	if (!launcher_running)
+	{
+		/*
+		 * Prepare the BackgroundWorker and launch it.
+		 */
+		memset(&bgw, 0, sizeof(bgw));
+		bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+		snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+		snprintf(bgw.bgw_function_name, BGW_MAXLEN, "DataChecksumsWorkerLauncherMain");
+		snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksum launcher");
+		snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksum launcher");
+		bgw.bgw_restart_time = BGW_NEVER_RESTART;
+		bgw.bgw_notify_pid = MyProcPid;
+		bgw.bgw_main_arg = (Datum) 0;
+
+		if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+			ereport(ERROR,
+					errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+					errmsg("failed to start background worker to process data checksums"));
+	}
+}
+
+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pg_stat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %d blocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * As of now we only update the block counter for main forks in order to
+	 * As of now we only update the block counter for the main fork in order
+	 * to avoid too frequent calls. TODO: investigate whether we should do it
+	 * more frequently.
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);
+
+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hint bits) on the primary happening while checksums
+		 * were off. This can only happen if there was a valid checksum on the
+		 * page at one point in the past, i.e. when checksums have been on,
+		 * then off, and are now being turned on again.  TODO: investigate if
+		 * this could be avoided if the checksum is calculated to be correct
+		 * and wal_level is set to "minimal".
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();
+
+		UnlockReleaseBuffer(buf);
+
+		/*
+		 * This is the only place where we check if we are asked to abort; the
+		 * abort will bubble up from here. It's safe to check this without
+		 * a lock, because if we miss it being set, we will try again soon.
+		 */
+		Assert(operation == ENABLE_DATACHECKSUMS);
+		if (DataChecksumsWorkerShmem->launch_operation == DISABLE_DATACHECKSUMS)
+			abort_requested = true;
+
+		if (abort_requested)
+			return false;
+
+		/*
+		 * As of now we only update the block counter for the main fork in
+		 * order to avoid too frequent calls. TODO: investigate whether we
+		 * should do it more frequently.
+		 */
+		if (forkNum == MAIN_FORKNUM)
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+										 (blknum + 1));
+
+		vacuum_delay_point(false);
+	}
+
+	pfree(relns);
+	return true;
+}
+
+/*
+ * ProcessSingleRelationByOid
+ *		Process a single relation based on oid.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationByOid(Oid relationId, BufferAccessStrategy strategy)
+{
+	Relation	rel;
+	bool		aborted = false;
+
+	StartTransactionCommand();
+
+	rel = try_relation_open(relationId, AccessShareLock);
+	if (rel == NULL)
+	{
+		/*
+		 * Relation no longer exists. We don't consider this an error since
+		 * there are no pages in it that need data checksums, and thus return
+		 * true. The worker operates off a list of relations generated at the
+		 * start of processing, so relations being dropped in the meantime is
+		 * to be expected.
+		 */
+		CommitTransactionCommand();
+		pgstat_report_activity(STATE_IDLE, NULL);
+		return true;
+	}
+	RelationGetSmgr(rel);
+
+	for (ForkNumber fnum = 0; fnum <= MAX_FORKNUM; fnum++)
+	{
+		if (smgrexists(rel->rd_smgr, fnum))
+		{
+			if (!ProcessSingleRelationFork(rel, fnum, strategy))
+			{
+				aborted = true;
+				break;
+			}
+		}
+	}
+	relation_close(rel, AccessShareLock);
+	elog(DEBUG2,
+		 "data checksum processing done for relation with OID %u: %s",
+		 relationId, (aborted ? "aborted" : "finished"));
+
+	CommitTransactionCommand();
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return !aborted;
+}
+
+/*
+ * ProcessDatabase
+ *		Enable data checksums in a single database.
+ *
+ * We do this by launching a dynamic background worker into this database, and
+ * waiting for it to finish.  We have to do this in a separate worker, since
+ * each process can only be connected to one database during its lifetime.
+ */
+static DataChecksumsWorkerResult
+ProcessDatabase(DataChecksumsWorkerDatabase *db)
+{
+	BackgroundWorker bgw;
+	BackgroundWorkerHandle *bgw_handle;
+	BgwHandleStatus status;
+	pid_t		pid;
+	char		activity[NAMEDATALEN + 64];
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_FAILED;
+
+	memset(&bgw, 0, sizeof(bgw));
+	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
+	snprintf(bgw.bgw_library_name, BGW_MAXLEN, "postgres");
+	snprintf(bgw.bgw_function_name, BGW_MAXLEN, "%s", "DataChecksumsWorkerMain");
+	snprintf(bgw.bgw_name, BGW_MAXLEN, "datachecksum worker");
+	snprintf(bgw.bgw_type, BGW_MAXLEN, "datachecksum worker");
+	bgw.bgw_restart_time = BGW_NEVER_RESTART;
+	bgw.bgw_notify_pid = MyProcPid;
+	bgw.bgw_main_arg = ObjectIdGetDatum(db->dboid);
+
+	/*
+	 * If there are no worker slots available, make sure we retry processing
+	 * this database. This will make the datachecksumsworker move on to the
+	 * next database and quite likely fail with the same problem. TODO: Maybe
+	 * we need a backoff to avoid running through all the databases here in
+	 * short order.
+	 */
+	if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+	{
+		ereport(WARNING,
+				errmsg("failed to start worker for enabling data checksums in database \"%s\", retrying",
+					   db->dbname),
+				errhint("The max_worker_processes setting might be too low."));
+		return DATACHECKSUMSWORKER_RETRYDB;
+	}
+
+	status = WaitForBackgroundWorkerStartup(bgw_handle, &pid);
+	if (status == BGWH_STOPPED)
+	{
+		ereport(WARNING,
+				errmsg("could not start background worker for enabling data checksums in database \"%s\"",
+					   db->dbname),
+				errhint("More details on the error might be found in the server log."));
+		return DATACHECKSUMSWORKER_FAILED;
+	}
+
+	/*
+	 * If the postmaster crashed we cannot end up with a processed database,
+	 * so we have no alternative but to exit. When enabling checksums we won't
+	 * have changed the pg_control version to enabled at this point, so when
+	 * the cluster comes back up processing will have to be restarted. When
+	 * disabling, the pg_control version will have been set to off before
+	 * this, so when the cluster comes up checksums will be off as expected.
+	 */
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errcode(ERRCODE_ADMIN_SHUTDOWN),
+				errmsg("cannot enable data checksums without the postmaster process"),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	Assert(status == BGWH_STARTED);
+	ereport(DEBUG1,
+			errmsg("initiating data checksum processing in database \"%s\"",
+				   db->dbname));
+
+	snprintf(activity, sizeof(activity) - 1,
+			 "Waiting for worker in database %s (pid %ld)", db->dbname, (long) pid);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	status = WaitForBackgroundWorkerShutdown(bgw_handle);
+	if (status == BGWH_POSTMASTER_DIED)
+		ereport(FATAL,
+				errcode(ERRCODE_ADMIN_SHUTDOWN),
+				errmsg("postmaster exited during data checksum processing in \"%s\"",
+					   db->dbname),
+				errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+	if (DataChecksumsWorkerShmem->success == DATACHECKSUMSWORKER_ABORTED)
+		ereport(LOG,
+				errmsg("data checksum processing was aborted in database \"%s\"",
+					   db->dbname));
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+
+	return DataChecksumsWorkerShmem->success;
+}
+
+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clear the launcher_running flag in shared memory to ensure that
+ * processing can be started again later.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}
+
+/*
+ * launcher_cancel_handler
+ *
+ * Internal routine for reacting to SIGINT and flagging the worker to abort.
+ * The worker won't be interrupted immediately but will check for abort flag
+ * between each block in a relation.
+ */
+static void
+launcher_cancel_handler(SIGNAL_ARGS)
+{
+	int			save_errno = errno;
+
+	abort_requested = true;
+
+	/*
+	 * There is no sleeping in the main loop; the flag will be checked
+	 * periodically in ProcessSingleRelationFork. The worker does however
+	 * sleep when waiting for concurrent transactions to end so we still need
+	 * to set the latch.
+	 */
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+}
+
+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks until all currently running transactions have finished
+ *
+ * Returns when all transactions which were active when the function was
+ * called have ended.  If the postmaster dies while waiting, the process
+ * exits with a FATAL error.
+ *
+ * NB: this will return early if aborted by SIGINT or if the target state
+ * is changed while we're running.
+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errcode(ERRCODE_ADMIN_SHUTDOWN),
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;
+	}
+
+	pgstat_report_activity(STATE_IDLE, NULL);
+	return;
+}
+
+/*
+ * DataChecksumsWorkerLauncherMain
+ *
+ * Main function for launching dynamic background workers for processing data
+ * checksums in databases. This function has the bgworker management, with
+ * ProcessAllDatabases being responsible for looping over the databases and
+ * initiating processing.
+ */
+void
+DataChecksumsWorkerLauncherMain(Datum arg)
+{
+	on_shmem_exit(launcher_exit, 0);
+
+	ereport(DEBUG1,
+			errmsg("background worker \"datachecksum launcher\" started"));
+
+	pqsignal(SIGTERM, die);
+	pqsignal(SIGINT, launcher_cancel_handler);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_LAUNCHER;
+	init_ps_display(NULL);
+
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+
+	if (DataChecksumsWorkerShmem->launcher_running)
+	{
+		/* Launcher was already running, let it finish */
+		LWLockRelease(DataChecksumsWorkerLock);
+		return;
+	}
+
+	launcher_running = true;
+
+	/*
+	 * Initialize a connection to shared catalogs only.
+	 */
+	BackgroundWorkerInitializeConnectionByOid(InvalidOid, InvalidOid, 0);
+
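+	/*
+	 * Copy the requested operation and parameters from the launch_* fields
+	 * into the active fields under the lock; the workers read the active
+	 * fields during processing.
+	 */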
+	operation = DataChecksumsWorkerShmem->launch_operation;
+	DataChecksumsWorkerShmem->launcher_running = true;
+	DataChecksumsWorkerShmem->operation = operation;
+	DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+	DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+	DataChecksumsWorkerShmem->immediate_checkpoint = DataChecksumsWorkerShmem->launch_fast;
+	LWLockRelease(DataChecksumsWorkerLock);
+
+	/*
+	 * The target state can change while we are busy enabling/disabling
+	 * checksums, if the user calls pg_disable/enable_data_checksums() before
+	 * we are finished with the previous request. In that case, we will loop
+	 * back here, to process the new request.
+	 */
+again:
+
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();
+
+		/*
+		 * Set the state to inprogress-on and wait on the procsignal barrier.
+		 */
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_ENABLING);
+		SetDataChecksumsOnInProgress(DataChecksumsWorkerShmem->immediate_checkpoint);
+
+		/*
+		 * All backends are now in inprogress-on state and are writing data
+		 * checksums.  Start processing all data at rest.
+		 */
+		if (!ProcessAllDatabases(DataChecksumsWorkerShmem->immediate_checkpoint))
+		{
+			/*
+			 * If the target state changed during processing then it's not a
+			 * failure, so restart processing instead.
+			 */
+			LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+			if (DataChecksumsWorkerShmem->launch_operation != operation)
+			{
+				LWLockRelease(DataChecksumsWorkerLock);
+				goto done;
+			}
+			LWLockRelease(DataChecksumsWorkerLock);
+			ereport(ERROR,
+					errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+					errmsg("unable to enable data checksums in cluster"));
+		}
+
+		/*
+		 * Data checksums have been set on all pages, set the state to on in
+		 * order to instruct backends to validate checksums on reading.
+		 */
+		SetDataChecksumsOn(DataChecksumsWorkerShmem->immediate_checkpoint);
+	}
+	else if (operation == DISABLE_DATACHECKSUMS)
+	{
+		int			flags;
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+									 PROGRESS_DATACHECKSUMS_PHASE_DISABLING);
+		SetDataChecksumsOff(DataChecksumsWorkerShmem->immediate_checkpoint);
+
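+		/* Force a checkpoint to persist the new data checksum state */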
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (DataChecksumsWorkerShmem->immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+	}
+	else
+	{
+		Assert(false);
+	}
+
+done:
+
+	/*
+	 * This state will only be displayed for a fleeting moment, but for the
+	 * sake of correctness it is still added before ending the command.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_DONE);
+
+	/*
+	 * All done. But before we exit, check if the target state was changed
+	 * while we were running. In that case we will have to start all over
+	 * again.
+	 */
+	LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+	if (DataChecksumsWorkerShmem->launch_operation != operation)
+	{
+		DataChecksumsWorkerShmem->operation = DataChecksumsWorkerShmem->launch_operation;
+		operation = DataChecksumsWorkerShmem->launch_operation;
+		DataChecksumsWorkerShmem->cost_delay = DataChecksumsWorkerShmem->launch_cost_delay;
+		DataChecksumsWorkerShmem->cost_limit = DataChecksumsWorkerShmem->launch_cost_limit;
+		LWLockRelease(DataChecksumsWorkerLock);
+		goto again;
+	}
+
+	/* Shut down progress reporting as we are done */
+	pgstat_progress_end_command();
+
+	launcher_running = false;
+	DataChecksumsWorkerShmem->launcher_running = false;
+	LWLockRelease(DataChecksumsWorkerLock);
+}
+
+/*
+ * ProcessAllDatabases
+ *		Compute the list of all databases and process checksums in each
+ *
+ * This will repeatedly generate a list of databases to process for enabling
+ * checksums, comparing each new list against the databases already processed,
+ * and loop until no new databases are found.
+ *
+ * If immediate_checkpoint is set to true then checkpoints are requested with
+ * CHECKPOINT_FAST. This is useful for testing but should be avoided in
+ * production use as it may affect cluster performance drastically.
+ */
+static bool
+ProcessAllDatabases(bool immediate_checkpoint)
+{
+	List	   *DatabaseList;
+	HTAB	   *ProcessedDatabases = NULL;
+	HASHCTL		hash_ctl;
+	bool		found_failed = false;
+	int			flags;
+
+	/* Initialize a hash tracking all processed databases */
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(DataChecksumsWorkerResultEntry);
+	ProcessedDatabases = hash_create("Processed databases",
+									 64,
+									 &hash_ctl,
+									 HASH_ELEM | HASH_BLOBS);
+
+	/*
+	 * Set up so that the first database processed also covers the shared
+	 * catalogs, rather than processing them once in every database.
+	 */
+	DataChecksumsWorkerShmem->process_shared_catalogs = true;
+
+	/*
+	 * Get a list of all databases to process. This may include databases that
+	 * were created during our runtime.  Since a database can be created as a
+	 * copy of any other database (which may not have existed in our last
+	 * run), we have to repeat this loop until no new databases show up in the
+	 * list.
+	 */
+	DatabaseList = BuildDatabaseList();
+
+	/* Allow a test case to modify the initial list of databases */
+	INJECTION_POINT("datachecksumsworker-initial-dblist", DatabaseList);
+
+	/*
+	 * Update progress reporting with the total number of databases we need to
+	 * process.  This number is not changed during processing; the column for
+	 * processed databases is instead increased so that it can be compared
+	 * against the total.
+	 */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_DBS_TOTAL,
+			PROGRESS_DATACHECKSUMS_DBS_DONE,
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE,
+			PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+			PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
+		};
+
+		int64		vals[6];
+
+		vals[0] = list_length(DatabaseList);
+		vals[1] = 0;
+
+		/* translated to NULL */
+		vals[2] = -1;
+		vals[3] = -1;
+		vals[4] = -1;
+		vals[5] = -1;
+
+		pgstat_progress_update_multi_param(6, index, vals);
+	}
+
+	while (true)
+	{
+		int			processed_databases = 0;
+
+		foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+		{
+			DataChecksumsWorkerResult result;
+			DataChecksumsWorkerResultEntry *entry;
+			bool		found;
+
+			/*
+			 * Check if this database has been processed already, and if so
+			 * whether it should be retried or skipped.
+			 */
+			entry = (DataChecksumsWorkerResultEntry *) hash_search(ProcessedDatabases, &db->dboid,
+																   HASH_FIND, NULL);
+
+			if (entry)
+			{
+				if (entry->result == DATACHECKSUMSWORKER_RETRYDB)
+				{
+					/*
+					 * Limit the number of retries to avoid infinite looping
+					 * in case there simply won't be enough workers in the
+					 * cluster to finish this operation.
+					 */
+					if (entry->retries > DATACHECKSUMSWORKER_MAX_DB_RETRIES)
+						entry->result = DATACHECKSUMSWORKER_FAILED;
+				}
+
+				/* Skip if this database has been processed already */
+				if (entry->result != DATACHECKSUMSWORKER_RETRYDB)
+					continue;
+			}
+
+			result = ProcessDatabase(db);
+			processed_databases++;
+
+			/*
+			 * Update the number of processed databases in the progress
+			 * report.
+			 */
+			pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_DBS_DONE,
+										 processed_databases);
+
+			/* Allow a test process to alter the result of the operation */
+			INJECTION_POINT("datachecksumsworker-fail-db", &result);
+
+			if (result == DATACHECKSUMSWORKER_SUCCESSFUL)
+			{
+				/*
+				 * If one database has completed shared catalogs, we don't
+				 * have to process them again.
+				 */
+				if (DataChecksumsWorkerShmem->process_shared_catalogs)
+					DataChecksumsWorkerShmem->process_shared_catalogs = false;
+			}
+			else if (result == DATACHECKSUMSWORKER_ABORTED)
+			{
+				/* Abort flag set, so exit the whole process */
+				return false;
+			}
+
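+			/* Record the result for this database, counting repeated attempts */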
+			entry = hash_search(ProcessedDatabases, &db->dboid, HASH_ENTER, &found);
+			entry->dboid = db->dboid;
+			entry->result = result;
+			if (!found)
+				entry->retries = 0;
+			else
+				entry->retries++;
+		}
+
+		elog(DEBUG1,
+			 "%d databases processed for data checksum enabling, %s",
+			 processed_databases,
+			 (processed_databases ? "restarting with a new list" : "processing completed"));
+
+		FreeDatabaseList(DatabaseList);
+
+		/*
+		 * If no databases were processed in this run of the loop, we have now
+		 * finished all databases and no concurrently created ones can exist.
+		 */
+		if (processed_databases == 0)
+			break;
+
+		/*
+		 * Re-generate the list of databases for another pass. Since we wait
+		 * for all pre-existing transactions to finish, we can be certain
+		 * that there are no databases left without checksums.
+		 */
+		WaitForAllTransactionsToFinish();
+		DatabaseList = BuildDatabaseList();
+	}
+
+	/*
+	 * ProcessedDatabases now has all databases and the results of their
+	 * processing. A failure to enable checksums for a database can mean that
+	 * processing actually failed for some reason, or that the database was
+	 * dropped between us getting the database list and trying to process it.
+	 * Get a fresh list of databases to detect the second case, where the
+	 * database was dropped before we had started processing it. If a database
+	 * still exists but enabling checksums failed, then we fail the entire
+	 * checksumming process and exit with an error.
+	 */
+	WaitForAllTransactionsToFinish();
+	DatabaseList = BuildDatabaseList();
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, DatabaseList)
+	{
+		DataChecksumsWorkerResultEntry *entry;
+		bool		found;
+
+		entry = hash_search(ProcessedDatabases, (void *) &db->dboid,
+							HASH_FIND, &found);
+
+		/*
+		 * We are only interested in the processed databases which failed, and
+		 * where the failed database still exists.  This indicates that
+		 * enabling checksums actually failed, and not that the failure was
+		 * due to the db being concurrently dropped.
+		 */
+		if (found && entry->result == DATACHECKSUMSWORKER_FAILED)
+		{
+			ereport(WARNING,
+					errmsg("failed to enable data checksums in database \"%s\"", db->dbname));
+			found_failed = found;
+			continue;
+		}
+	}
+
+	FreeDatabaseList(DatabaseList);
+
+	if (found_failed)
+	{
+		/* Disable checksums on cluster, because we failed */
+		SetDataChecksumsOff(DataChecksumsWorkerShmem->immediate_checkpoint);
+		/* Force a checkpoint to make everything consistent */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+		if (immediate_checkpoint)
+			flags = flags | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+				errmsg("data checksums could not be enabled in all databases, aborting"),
+				errhint("The server log might have more information on the cause of the error."));
+	}
+
+	/*
+	 * When enabling checksums, we have to wait for a checkpoint before the
+	 * state can change from inprogress-on to on.
+	 */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT);
+
+	/*
+	 * Force a checkpoint to get everything out to disk. The use of immediate
+	 * checkpoints is intended for tests, which would otherwise not complete
+	 * reliably within their timeout limits.
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_BARRIER);
+	return true;
+}
+
+/*
+ * DataChecksumsWorkerShmemSize
+ *		Compute required space for datachecksumsworker-related shared memory
+ */
+Size
+DataChecksumsWorkerShmemSize(void)
+{
+	Size		size;
+
+	size = sizeof(DataChecksumsWorkerShmemStruct);
+	size = MAXALIGN(size);
+
+	return size;
+}
+
+/*
+ * DataChecksumsWorkerShmemInit
+ *		Allocate and initialize datachecksumsworker-related shared memory
+ */
+void
+DataChecksumsWorkerShmemInit(void)
+{
+	bool		found;
+
+	DataChecksumsWorkerShmem = (DataChecksumsWorkerShmemStruct *)
+		ShmemInitStruct("DataChecksumsWorker Data",
+						DataChecksumsWorkerShmemSize(),
+						&found);
+
+	if (!found)
+	{
+		MemSet(DataChecksumsWorkerShmem, 0, DataChecksumsWorkerShmemSize());
+
+		/*
+		 * Even though these assignments are redundant after the MemSet, we
+		 * want to be explicit about our intent for readability, since this
+		 * state may be queried when restarting processing.
+		 */
+		DataChecksumsWorkerShmem->launch_operation = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		DataChecksumsWorkerShmem->launch_fast = false;
+	}
+}
+
+/*
+ * BuildDatabaseList
+ *		Compile a list of all currently available databases in the cluster
+ *
+ * This creates the list of databases for the datachecksumsworker workers to
+ * add checksums to. If the caller wants to ensure that no concurrently
+ * running CREATE DATABASE calls exist, this needs to be preceded by a call
+ * to WaitForAllTransactionsToFinish().
+ */
+static List *
+BuildDatabaseList(void)
+{
+	List	   *DatabaseList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(DatabaseRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_database pgdb = (Form_pg_database) GETSTRUCT(tup);
+		DataChecksumsWorkerDatabase *db;
+
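+		/*
+		 * Allocate the list entries in the caller's memory context, captured
+		 * before starting the transaction, so that they survive the commit.
+		 */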
+		oldctx = MemoryContextSwitchTo(ctx);
+
+		db = (DataChecksumsWorkerDatabase *) palloc0(sizeof(DataChecksumsWorkerDatabase));
+
+		db->dboid = pgdb->oid;
+		db->dbname = pstrdup(NameStr(pgdb->datname));
+
+		DatabaseList = lappend(DatabaseList, db);
+
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return DatabaseList;
+}
+
+static void
+FreeDatabaseList(List *dblist)
+{
+	if (!dblist)
+		return;
+
+	foreach_ptr(DataChecksumsWorkerDatabase, db, dblist)
+	{
+		if (db->dbname != NULL)
+			pfree(db->dbname);
+	}
+
+	list_free_deep(dblist);
+}
+
+/*
+ * BuildRelationList
+ *		Compile a list of relations in the database
+ *
+ * Returns a list of OIDs for the requested relation types. If temp_relations
+ * is true then only temporary relations are returned. If temp_relations is
+ * false then non-temporary relations which have storage are returned. If
+ * include_shared is true then shared relations are included as well in a
+ * non-temporary list. include_shared has no relevance when building a list of
+ * temporary relations.
+ */
+static List *
+BuildRelationList(bool temp_relations, bool include_shared)
+{
+	List	   *RelationList = NIL;
+	Relation	rel;
+	TableScanDesc scan;
+	HeapTuple	tup;
+	MemoryContext ctx = CurrentMemoryContext;
+	MemoryContext oldctx;
+
+	StartTransactionCommand();
+
+	rel = table_open(RelationRelationId, AccessShareLock);
+	scan = table_beginscan_catalog(rel, 0, NULL);
+
+	while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
+	{
+		Form_pg_class pgc = (Form_pg_class) GETSTRUCT(tup);
+
+		/*
+		 * Only include temporary relations when asked for a temp relation
+		 * list.
+		 */
+		if (pgc->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			if (!temp_relations)
+				continue;
+		}
+		else
+		{
+			/*
+			 * If we are only interested in temp relations then continue
+			 * immediately as the current relation isn't a temp relation.
+			 */
+			if (temp_relations)
+				continue;
+
+			if (!RELKIND_HAS_STORAGE(pgc->relkind))
+				continue;
+
+			if (pgc->relisshared && !include_shared)
+				continue;
+		}
+
+		oldctx = MemoryContextSwitchTo(ctx);
+		RelationList = lappend_oid(RelationList, pgc->oid);
+		MemoryContextSwitchTo(oldctx);
+	}
+
+	table_endscan(scan);
+	table_close(rel, AccessShareLock);
+
+	CommitTransactionCommand();
+
+	return RelationList;
+}
+
+/*
+ * DataChecksumsWorkerMain
+ *
+ * Main function for enabling checksums in a single database. This is the
+ * function set as the bgw_function_name in the dynamic background worker
+ * process initiated for each database by the worker launcher. After enabling
+ * data checksums in each applicable relation in the database, it will wait for
+ * all temporary relations that were present when the function started to
+ * disappear before returning. This is required since we cannot rewrite
+ * existing temporary relations with data checksums.
+ */
+void
+DataChecksumsWorkerMain(Datum arg)
+{
+	Oid			dboid = DatumGetObjectId(arg);
+	List	   *RelationList = NIL;
+	List	   *InitialTempTableList = NIL;
+	BufferAccessStrategy strategy;
+	bool		aborted = false;
+	int64		rels_done;
+
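+	/*
+	 * The per-database worker is only used when enabling data checksums;
+	 * disabling the feature does not require rewriting any pages.
+	 */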
+	operation = ENABLE_DATACHECKSUMS;
+
+	pqsignal(SIGTERM, die);
+
+	BackgroundWorkerUnblockSignals();
+
+	MyBackendType = B_DATACHECKSUMSWORKER_WORKER;
+	init_ps_display(NULL);
+
+	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
+											  BGWORKER_BYPASS_ALLOWCONN);
+
+	/* worker will have a separate entry in pg_stat_progress_data_checksums */
+	pgstat_progress_start_command(PROGRESS_COMMAND_DATACHECKSUMS,
+								  InvalidOid);
+
+	/*
+	 * Get a list of all temp tables present as we start in this database. We
+	 * need to wait until they are all gone before we are done, since we cannot
+	 * access these relations and modify them.
+	 */
+	InitialTempTableList = BuildRelationList(true, false);
+
+	/*
+	 * Enable vacuum cost delay, if any.
+	 */
+	Assert(DataChecksumsWorkerShmem->operation == ENABLE_DATACHECKSUMS);
+	VacuumCostDelay = DataChecksumsWorkerShmem->cost_delay;
+	VacuumCostLimit = DataChecksumsWorkerShmem->cost_limit;
+	VacuumCostActive = (VacuumCostDelay > 0);
+	VacuumCostBalance = 0;
+	VacuumCostPageHit = 0;
+	VacuumCostPageMiss = 0;
+	VacuumCostPageDirty = 0;
+
+	/*
+	 * Create and set the vacuum strategy as our buffer strategy.
+	 */
+	strategy = GetAccessStrategy(BAS_VACUUM);
+
+	RelationList = BuildRelationList(false,
+									 DataChecksumsWorkerShmem->process_shared_catalogs);
+
+	/* Update the total number of relations to be processed in this DB. */
+	{
+		const int	index[] = {
+			PROGRESS_DATACHECKSUMS_RELS_TOTAL,
+			PROGRESS_DATACHECKSUMS_RELS_DONE
+		};
+
+		int64		vals[2];
+
+		vals[0] = list_length(RelationList);
+		vals[1] = 0;
+
+		pgstat_progress_update_multi_param(2, index, vals);
+	}
+
+	/* Process the relations */
+	rels_done = 0;
+	foreach_oid(reloid, RelationList)
+	{
+		if (!ProcessSingleRelationByOid(reloid, strategy))
+		{
+			aborted = true;
+			break;
+		}
+
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_RELS_DONE,
+									 ++rels_done);
+	}
+	list_free(RelationList);
+
+	if (aborted)
+	{
+		DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+		ereport(DEBUG1,
+				errmsg("data checksum processing aborted in database OID %u",
+					   dboid));
+		return;
+	}
+
+	/* The worker is about to wait for temporary tables to go away. */
+	pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_PHASE,
+								 PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL);
+
+	/*
+	 * Wait for all temp tables that existed when we started to go away. This
+	 * is necessary since we cannot "reach" them to enable checksums. Any temp
+	 * tables created after we started will already have checksums in them
+	 * (due to the "inprogress-on" state), so no need to wait for those.
+	 */
+	for (;;)
+	{
+		List	   *CurrentTempTables;
+		int			numleft;
+		char		activity[64];
+
+		CurrentTempTables = BuildRelationList(true, false);
+		numleft = 0;
+		foreach_oid(tmptbloid, InitialTempTableList)
+		{
+			if (list_member_oid(CurrentTempTables, tmptbloid))
+				numleft++;
+		}
+		list_free(CurrentTempTables);
+
+		INJECTION_POINT("datachecksumsworker-fake-temptable-wait", &numleft);
+
+		if (numleft == 0)
+			break;
+
+		/*
+		 * At least one temp table is left to wait for, so indicate this in
+		 * the pgstat activity.
+		 */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for %d temp tables to be removed", numleft);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		(void) WaitLatch(MyLatch,
+						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+						 3000,
+						 WAIT_EVENT_CHECKSUM_ENABLE_TEMPTABLE_WAIT);
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		aborted = DataChecksumsWorkerShmem->launch_operation != operation;
+		LWLockRelease(DataChecksumsWorkerLock);
+
+		if (aborted || abort_requested)
+		{
+			DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_ABORTED;
+			ereport(DEBUG1,
+					errmsg("data checksum processing aborted in database OID %u",
+						   dboid));
+			return;
+		}
+	}
+
+	list_free(InitialTempTableList);
+
+	/* worker done */
+	pgstat_progress_end_command();
+
+	DataChecksumsWorkerShmem->success = DATACHECKSUMSWORKER_SUCCESSFUL;
+}
diff --git a/src/backend/postmaster/meson.build b/src/backend/postmaster/meson.build
index 0008603cfee..ce10ef1059a 100644
--- a/src/backend/postmaster/meson.build
+++ b/src/backend/postmaster/meson.build
@@ -6,6 +6,7 @@ backend_sources += files(
   'bgworker.c',
   'bgwriter.c',
   'checkpointer.c',
+  'datachecksumsworker.c',
   'fork_process.c',
   'interrupt.c',
   'launch_backend.c',
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 7c064cf9fbb..10b861c0c77 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2991,6 +2991,11 @@ PostmasterStateMachine(void)
 									B_INVALID,
 									B_STANDALONE_BACKEND);
 
+			/* also add checksumming processes */
+			remainMask = btmask_add(remainMask,
+									B_DATACHECKSUMSWORKER_LAUNCHER,
+									B_DATACHECKSUMSWORKER_WORKER);
+
 			/* All types should be included in targetMask or remainMask */
 			Assert((remainMask.mask | targetMask.mask) == BTYPE_MASK_ALL.mask);
 		}
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..f9f06821a8f 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -186,6 +186,7 @@ xlog_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_FPW_CHANGE:
 		case XLOG_FPI_FOR_HINT:
 		case XLOG_FPI:
+		case XLOG_CHECKSUMS:
 		case XLOG_OVERWRITE_CONTRECORD:
 		case XLOG_CHECKPOINT_REDO:
 			break;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index b23d0c19360..68507b1b5cc 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -31,6 +31,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/datachecksumsworker.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
@@ -140,6 +141,7 @@ CalculateShmemSize(void)
 	size = add_size(size, SlotSyncShmemSize());
 	size = add_size(size, AioShmemSize());
 	size = add_size(size, WaitLSNShmemSize());
+	size = add_size(size, DataChecksumsWorkerShmemSize());
 
 	/* include additional requested shmem from preload libraries */
 	size = add_size(size, total_addin_request);
@@ -316,6 +318,7 @@ CreateOrAttachShmemStructs(void)
 	PgArchShmemInit();
 	ApplyLauncherShmemInit();
 	SlotSyncShmemInit();
+	DataChecksumsWorkerShmemInit();
 
 	/*
 	 * Set up other modules that need some shared memory space
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 087821311cc..2f6ccdfb32f 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -18,12 +18,14 @@
 #include <unistd.h>
 
 #include "access/parallel.h"
+#include "access/xlog.h"
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/pg_bitutils.h"
 #include "replication/logicalworker.h"
 #include "replication/walsender.h"
+#include "storage/checksum.h"
 #include "storage/condition_variable.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
@@ -576,6 +578,18 @@ ProcessProcSignalBarrier(void)
 					case PROCSIGNAL_BARRIER_SMGRRELEASE:
 						processed = ProcessBarrierSmgrRelease();
 						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_ON:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
+						break;
+					case PROCSIGNAL_BARRIER_CHECKSUM_OFF:
+						processed = AbsorbDataChecksumsBarrier(PG_DATA_CHECKSUM_OFF);
+						break;
 				}
 
 				/*
diff --git a/src/backend/storage/page/README b/src/backend/storage/page/README
index e30d7ac59ad..73c36a63908 100644
--- a/src/backend/storage/page/README
+++ b/src/backend/storage/page/README
@@ -10,7 +10,9 @@ http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf, discussed
 2010/12/22 on -hackers list.
 
 Current implementation requires this be enabled system-wide at initdb time, or
-by using the pg_checksums tool on an offline cluster.
+by using the pg_checksums tool on an offline cluster.  Checksums can also be
+enabled at runtime using pg_enable_data_checksums(), and disabled by using
+pg_disable_data_checksums().
 
 The checksum is not valid at all times on a data page!!
 The checksum is valid when the page leaves the shared pool and is checked
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index aac6e695954..cfb1753ffba 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -107,7 +107,7 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 	 */
 	if (!PageIsNew(page))
 	{
-		if (DataChecksumsEnabled())
+		if (DataChecksumsNeedVerify())
 		{
 			checksum = pg_checksum_page(page, blkno);
 
@@ -151,8 +151,8 @@ PageIsVerified(PageData *page, BlockNumber blkno, int flags, bool *checksum_fail
 		if ((flags & (PIV_LOG_WARNING | PIV_LOG_LOG)) != 0)
 			ereport(flags & PIV_LOG_WARNING ? WARNING : LOG,
 					(errcode(ERRCODE_DATA_CORRUPTED),
-					 errmsg("page verification failed, calculated checksum %u but expected %u",
-							checksum, p->pd_checksum)));
+					 errmsg("page verification failed, calculated checksum %u but expected %u (page LSN %X/%08X)",
+							checksum, p->pd_checksum, LSN_FORMAT_ARGS(PageXLogRecPtrGet(p->pd_lsn)))));
 
 		if (header_sane && (flags & PIV_IGNORE_CHECKSUM_FAILURE))
 			return true;
@@ -1511,7 +1511,7 @@ PageSetChecksumCopy(Page page, BlockNumber blkno)
 	static char *pageCopy = NULL;
 
 	/* If we don't need a checksum, just return the passed-in data */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return page;
 
 	/*
@@ -1541,7 +1541,7 @@ void
 PageSetChecksumInplace(Page page, BlockNumber blkno)
 {
 	/* If we don't need a checksum, just return */
-	if (PageIsNew(page) || !DataChecksumsEnabled())
+	if (PageIsNew(page) || !DataChecksumsNeedWrite())
 		return;
 
 	((PageHeader) page)->pd_checksum = pg_checksum_page(page, blkno);
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 199ba2cc17a..7afe0098267 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -380,6 +380,8 @@ pgstat_tracks_backend_bktype(BackendType bktype)
 		case B_CHECKPOINTER:
 		case B_IO_WORKER:
 		case B_STARTUP:
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 			return false;
 
 		case B_AUTOVAC_WORKER:
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 13ae57ed649..a290d56f409 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -362,6 +362,8 @@ pgstat_tracks_io_bktype(BackendType bktype)
 		case B_LOGGER:
 			return false;
 
+		case B_DATACHECKSUMSWORKER_LAUNCHER:
+		case B_DATACHECKSUMSWORKER_WORKER:
 		case B_AUTOVAC_LAUNCHER:
 		case B_AUTOVAC_WORKER:
 		case B_BACKEND:
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index c1ac71ff7f2..e94716ac3a1 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -118,6 +118,9 @@ CHECKPOINT_DELAY_COMPLETE	"Waiting for a backend that blocks a checkpoint from c
 CHECKPOINT_DELAY_START	"Waiting for a backend that blocks a checkpoint from starting."
 CHECKPOINT_DONE	"Waiting for a checkpoint to complete."
 CHECKPOINT_START	"Waiting for a checkpoint to start."
+CHECKSUM_ENABLE_STARTCONDITION	"Waiting for data checksums enabling to start."
+CHECKSUM_ENABLE_FINISHCONDITION	"Waiting for data checksums to be enabled."
+CHECKSUM_ENABLE_TEMPTABLE_WAIT	"Waiting for temporary tables to be dropped for data checksums to be enabled."
 EXECUTE_GATHER	"Waiting for activity from a child process while executing a <literal>Gather</literal> plan node."
 HASH_BATCH_ALLOCATE	"Waiting for an elected Parallel Hash participant to allocate a hash table."
 HASH_BATCH_ELECT	"Waiting to elect a Parallel Hash participant to allocate a hash table."
@@ -358,6 +361,7 @@ InjectionPoint	"Waiting to read or update information related to injection point
 SerialControl	"Waiting to read or update shared <filename>pg_serial</filename> state."
 AioWorkerSubmissionQueue	"Waiting to access AIO worker submission queue."
 WaitLSN	"Waiting to read or update shared Wait-for-LSN state."
+DataChecksumsWorker	"Waiting to read or update data checksums worker state."
 
 #
 # END OF PREDEFINED LWLOCKS (DO NOT CHANGE THIS LINE)
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 7e2ed69138a..0c8ce7fa8b1 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -295,6 +295,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		cmdtype = PROGRESS_COMMAND_BASEBACKUP;
 	else if (pg_strcasecmp(cmd, "COPY") == 0)
 		cmdtype = PROGRESS_COMMAND_COPY;
+	else if (pg_strcasecmp(cmd, "DATACHECKSUMS") == 0)
+		cmdtype = PROGRESS_COMMAND_DATACHECKSUMS;
 	else
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1167,9 +1169,6 @@ pg_stat_get_db_checksum_failures(PG_FUNCTION_ARGS)
 	int64		result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
@@ -1185,9 +1184,6 @@ pg_stat_get_db_checksum_last_failure(PG_FUNCTION_ARGS)
 	TimestampTz result;
 	PgStat_StatDBEntry *dbentry;
 
-	if (!DataChecksumsEnabled())
-		PG_RETURN_NULL();
-
 	if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
 		result = 0;
 	else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index fec79992c8d..9b78e0012ef 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -844,7 +844,8 @@ InitializeSessionUserIdStandalone(void)
 	 * workers, in slot sync worker and in background workers.
 	 */
 	Assert(!IsUnderPostmaster || AmAutoVacuumWorkerProcess() ||
-		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess());
+		   AmLogicalSlotSyncWorkerProcess() || AmBackgroundWorkerProcess() ||
+		   AmDataChecksumsWorkerProcess());
 
 	/* call only once */
 	Assert(!OidIsValid(AuthenticatedUserId));
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 98f9598cd78..b598deb5648 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -746,6 +746,24 @@ InitPostgres(const char *in_dbname, Oid dboid,
 
 	ProcSignalInit(MyCancelKey, MyCancelKeyLength);
 
+	/*
+	 * Initialize a local cache of the data_checksum_version, to be updated by
+	 * the procsignal-based barriers.
+	 *
+	 * This intentionally happens after initializing the procsignal, otherwise
+	 * we might miss a state change. This means we can get a barrier for the
+	 * state we've just initialized - but it can happen only once.
+	 *
+	 * The postmaster (which is what gets forked into the new child process)
+	 * does not handle barriers, therefore it may not have the current
+	 * LocalDataChecksumVersion value (it'll have the value read from the
+	 * control file, which may be arbitrarily old).
+	 *
+	 * NB: Even if the postmaster handled barriers, the value might still be
+	 * stale, as it might have changed after this process forked.
+	 */
+	InitLocalDataChecksumVersion();
+
 	/*
 	 * Also set up timeout handlers needed for backend operation.  We need
 	 * these in every case except bootstrap.
@@ -874,7 +892,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 					 errhint("You should immediately run CREATE USER \"%s\" SUPERUSER;.",
 							 username != NULL ? username : "postgres")));
 	}
-	else if (AmBackgroundWorkerProcess())
+	else if (AmBackgroundWorkerProcess() || AmDataChecksumsWorkerProcess())
 	{
 		if (username == NULL && !OidIsValid(useroid))
 		{
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 3b9d8349078..d3d2424db51 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -531,11 +531,12 @@
   max => '1.0',
 },
 
-{ name => 'data_checksums', type => 'bool', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
+{ name => 'data_checksums', type => 'enum', context => 'PGC_INTERNAL', group => 'PRESET_OPTIONS',
   short_desc => 'Shows whether data checksums are turned on for this cluster.',
   flags => 'GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED',
   variable => 'data_checksums',
-  boot_val => 'false',
+  boot_val => 'PG_DATA_CHECKSUM_OFF',
+  options => 'data_checksums_options',
 },
 
 # Can't be set by ALTER SYSTEM as it can lead to recursive definition
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f87b558c2c6..8e7abad1b35 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -491,6 +491,14 @@ static const struct config_enum_entry file_copy_method_options[] = {
 	{NULL, 0, false}
 };
 
+static const struct config_enum_entry data_checksums_options[] = {
+	{"on", PG_DATA_CHECKSUM_VERSION, true},
+	{"off", PG_DATA_CHECKSUM_OFF, true},
+	{"inprogress-on", PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, true},
+	{"inprogress-off", PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, true},
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -617,7 +625,6 @@ static int	shared_memory_size_mb;
 static int	shared_memory_size_in_huge_pages;
 static int	wal_block_size;
 static int	num_os_semaphores;
-static bool data_checksums;
 static bool integer_datetimes;
 
 #ifdef USE_ASSERT_CHECKING
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 46cb2f36efa..327a677cb81 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -585,7 +585,7 @@ main(int argc, char *argv[])
 		ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
 		pg_fatal("cluster must be shut down");
 
-	if (ControlFile->data_checksum_version == 0 &&
+	if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_CHECK)
 		pg_fatal("data checksums are not enabled in cluster");
 
@@ -593,7 +593,7 @@ main(int argc, char *argv[])
 		mode == PG_MODE_DISABLE)
 		pg_fatal("data checksums are already disabled in cluster");
 
-	if (ControlFile->data_checksum_version > 0 &&
+	if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION &&
 		mode == PG_MODE_ENABLE)
 		pg_fatal("data checksums are already enabled in cluster");
 
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index 30ad46912e1..3151e3d8265 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -287,6 +287,8 @@ main(int argc, char *argv[])
 		   ControlFile->checkPointCopy.oldestCommitTsXid);
 	printf(_("Latest checkpoint's newestCommitTsXid:%u\n"),
 		   ControlFile->checkPointCopy.newestCommitTsXid);
+	printf(_("Latest checkpoint's data_checksum_version:%u\n"),
+		   ControlFile->checkPointCopy.data_checksum_version);
 	printf(_("Time of latest checkpoint:            %s\n"),
 		   ckpttime_str);
 	printf(_("Fake LSN counter for unlogged rels:   %X/%08X\n"),
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 90cef0864de..29684e82440 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -15,6 +15,7 @@
 #include "access/xlog_internal.h"
 #include "common/string.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
 
 
 /*
@@ -736,6 +737,14 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
+	/*
+	 * If data checksums are in any in-progress state then disallow the
+	 * upgrade. The user should either let the process finish, or turn off
+	 * data checksums, before retrying.
+	 */
+	if (oldctrl->data_checksum_version > PG_DATA_CHECKSUM_VERSION)
+		pg_fatal("data checksums are being enabled or disabled in the old cluster");
+
 	/*
 	 * We might eventually allow upgrades from checksum to no-checksum
 	 * clusters.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 605280ed8fb..100df16384f 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -56,6 +56,7 @@ extern PGDLLIMPORT int CommitDelay;
 extern PGDLLIMPORT int CommitSiblings;
 extern PGDLLIMPORT bool track_wal_io_timing;
 extern PGDLLIMPORT int wal_decode_buffer_size;
+extern PGDLLIMPORT int data_checksums;
 
 extern PGDLLIMPORT int CheckPointSegments;
 
@@ -117,7 +118,7 @@ extern PGDLLIMPORT int wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
+#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
 
 /* Do we need to WAL-log information required only for Hot Standby and logical replication? */
 #define XLogStandbyInfoActive() (wal_level >= WAL_LEVEL_REPLICA)
@@ -230,7 +231,16 @@ extern XLogRecPtr GetXLogWriteRecPtr(void);
 
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-extern bool DataChecksumsEnabled(void);
+extern bool DataChecksumsNeedWrite(void);
+extern bool DataChecksumsNeedVerify(void);
+extern bool DataChecksumsOnInProgress(void);
+extern bool DataChecksumsOffInProgress(void);
+extern void SetDataChecksumsOnInProgress(bool immediate_checkpoint);
+extern void SetDataChecksumsOn(bool immediate_checkpoint);
+extern void SetDataChecksumsOff(bool immediate_checkpoint);
+extern bool AbsorbDataChecksumsBarrier(int target_state);
+extern const char *show_data_checksums(void);
+extern void InitLocalDataChecksumVersion(void);
 extern bool GetDefaultCharSignedness(void);
 extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 34deb2fe5f0..faaa0e62d38 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -25,6 +25,7 @@
 #include "lib/stringinfo.h"
 #include "pgtime.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/relfilelocator.h"
 
 
@@ -289,6 +290,12 @@ typedef struct xl_restore_point
 	char		rp_name[MAXFNAMELEN];
 } xl_restore_point;
 
+/* Information logged when data checksum level is changed */
+typedef struct xl_checksum_state
+{
+	uint32		new_checksumtype;
+} xl_checksum_state;
+
 /* Overwrite of prior contrecord */
 typedef struct xl_overwrite_contrecord
 {
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 293e9e03f59..7fc7209b3f3 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -62,6 +62,9 @@ typedef struct CheckPoint
 	 * set to InvalidTransactionId.
 	 */
 	TransactionId oldestActiveXid;
+
+	/* data checksum version at the time of the checkpoint */
+	uint32		data_checksum_version;
 } CheckPoint;
 
 /* XLOG info values for XLOG rmgr */
@@ -80,6 +83,7 @@ typedef struct CheckPoint
 /* 0xC0 is used in Postgres 9.5-11 */
 #define XLOG_OVERWRITE_CONTRECORD		0xD0
 #define XLOG_CHECKPOINT_REDO			0xE0
+#define XLOG_CHECKSUMS					0xF0
 
 
 /*
@@ -221,7 +225,7 @@ typedef struct ControlFileData
 	bool		float8ByVal;	/* float8, int8, etc pass-by-value? */
 
 	/* Are data pages protected by checksums? Zero if no checksum version */
-	uint32		data_checksum_version;
+	uint32		data_checksum_version;	/* persistent */
 
 	/*
 	 * True if the default signedness of char is "signed" on a platform where
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 66af2d96d67..f319c7661ab 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12389,6 +12389,25 @@
   proname => 'jsonb_subscript_handler', prorettype => 'internal',
   proargtypes => 'internal', prosrc => 'jsonb_subscript_handler' },
 
+# data checksum management functions
+{ oid => '9258',
+  descr => 'disable data checksums',
+  proname => 'pg_disable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'bool', proallargtypes => '{bool}',
+  proargmodes => '{i}',
+  proargnames => '{fast}',
+  prosrc => 'disable_data_checksums' },
+
+{ oid => '9257',
+  descr => 'enable data checksums',
+  proname => 'pg_enable_data_checksums', provolatile => 'v', prorettype => 'void',
+  proparallel => 'r',
+  proargtypes => 'int4 int4 bool', proallargtypes => '{int4,int4,bool}',
+  proargmodes => '{i,i,i}',
+  proargnames => '{cost_delay,cost_limit,fast}',
+  prosrc => 'enable_data_checksums' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h
index 1cde4bd9bcf..afd0df4cbb6 100644
--- a/src/include/commands/progress.h
+++ b/src/include/commands/progress.h
@@ -162,4 +162,21 @@
 #define PROGRESS_COPY_TYPE_PIPE 3
 #define PROGRESS_COPY_TYPE_CALLBACK 4
 
+/* Progress parameters for PROGRESS_DATACHECKSUMS */
+#define PROGRESS_DATACHECKSUMS_PHASE		0
+#define PROGRESS_DATACHECKSUMS_DBS_TOTAL	1
+#define PROGRESS_DATACHECKSUMS_DBS_DONE		2
+#define PROGRESS_DATACHECKSUMS_RELS_TOTAL	3
+#define PROGRESS_DATACHECKSUMS_RELS_DONE	4
+#define PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL	5
+#define PROGRESS_DATACHECKSUMS_BLOCKS_DONE	6
+
+/* Phases of datachecksumsworker operation */
+#define PROGRESS_DATACHECKSUMS_PHASE_ENABLING			0
+#define PROGRESS_DATACHECKSUMS_PHASE_DISABLING			1
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_TEMPREL	2
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_CHECKPOINT 3
+#define PROGRESS_DATACHECKSUMS_PHASE_WAITING_BARRIER	4
+#define PROGRESS_DATACHECKSUMS_PHASE_DONE				5
+
 #endif
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..581fbae2ee0 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -367,6 +367,9 @@ typedef enum BackendType
 	B_WAL_SUMMARIZER,
 	B_WAL_WRITER,
 
+	B_DATACHECKSUMSWORKER_LAUNCHER,
+	B_DATACHECKSUMSWORKER_WORKER,
+
 	/*
 	 * Logger is not connected to shared memory and does not have a PGPROC
 	 * entry.
@@ -392,6 +395,9 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmWalSummarizerProcess()	(MyBackendType == B_WAL_SUMMARIZER)
 #define AmWalWriterProcess()		(MyBackendType == B_WAL_WRITER)
 #define AmIoWorkerProcess()			(MyBackendType == B_IO_WORKER)
+#define AmDataChecksumsWorkerProcess() \
+	(MyBackendType == B_DATACHECKSUMSWORKER_LAUNCHER || \
+	 MyBackendType == B_DATACHECKSUMSWORKER_WORKER)
 
 #define AmSpecialWorkerProcess() \
 	(AmAutoVacuumLauncherProcess() || \
diff --git a/src/include/postmaster/datachecksumsworker.h b/src/include/postmaster/datachecksumsworker.h
new file mode 100644
index 00000000000..0daef709ec8
--- /dev/null
+++ b/src/include/postmaster/datachecksumsworker.h
@@ -0,0 +1,51 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.h
+ *	  header file for data checksum helper background worker
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/postmaster/datachecksumsworker.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef DATACHECKSUMSWORKER_H
+#define DATACHECKSUMSWORKER_H
+
+/* Shared memory */
+extern Size DataChecksumsWorkerShmemSize(void);
+extern void DataChecksumsWorkerShmemInit(void);
+
+/* Possible operations the Datachecksumsworker can perform */
+typedef enum DataChecksumsWorkerOperation
+{
+	ENABLE_DATACHECKSUMS,
+	DISABLE_DATACHECKSUMS,
+	/* TODO: VERIFY_DATACHECKSUMS, */
+} DataChecksumsWorkerOperation;
+
+/*
+ * Possible states for a database entry which has been processed. Exported
+ * here since we want to be able to reference this from injection point tests.
+ */
+typedef enum
+{
+	DATACHECKSUMSWORKER_SUCCESSFUL = 0,
+	DATACHECKSUMSWORKER_ABORTED,
+	DATACHECKSUMSWORKER_FAILED,
+	DATACHECKSUMSWORKER_RETRYDB,
+} DataChecksumsWorkerResult;
+
+/* Start the background processes for enabling or disabling checksums */
+void		StartDataChecksumsWorkerLauncher(DataChecksumsWorkerOperation op,
+											 int cost_delay,
+											 int cost_limit,
+											 bool fast);
+
+/* Background worker entrypoints */
+void		DataChecksumsWorkerLauncherMain(Datum arg);
+void		DataChecksumsWorkerMain(Datum arg);
+
+#endif							/* DATACHECKSUMSWORKER_H */
diff --git a/src/include/postmaster/proctypelist.h b/src/include/postmaster/proctypelist.h
index 242862451d8..3dc93b176d9 100644
--- a/src/include/postmaster/proctypelist.h
+++ b/src/include/postmaster/proctypelist.h
@@ -38,6 +38,8 @@ PG_PROCTYPE(B_BACKEND, gettext_noop("client backend"), BackendMain, true)
 PG_PROCTYPE(B_BG_WORKER, gettext_noop("background worker"), BackgroundWorkerMain, true)
 PG_PROCTYPE(B_BG_WRITER, gettext_noop("background writer"), BackgroundWriterMain, true)
 PG_PROCTYPE(B_CHECKPOINTER, gettext_noop("checkpointer"), CheckpointerMain, true)
+PG_PROCTYPE(B_DATACHECKSUMSWORKER_LAUNCHER, gettext_noop("datachecksum launcher"), NULL, false)
+PG_PROCTYPE(B_DATACHECKSUMSWORKER_WORKER, gettext_noop("datachecksum worker"), NULL, false)
 PG_PROCTYPE(B_DEAD_END_BACKEND, gettext_noop("dead-end client backend"), BackendMain, true)
 PG_PROCTYPE(B_INVALID, gettext_noop("unrecognized"), NULL, false)
 PG_PROCTYPE(B_IO_WORKER, gettext_noop("io worker"), IoWorkerMain, true)
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index abc2cf2a020..2fb242f029d 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -16,6 +16,7 @@
 
 #include "access/xlogdefs.h"
 #include "storage/block.h"
+#include "storage/checksum.h"
 #include "storage/off.h"
 
 /* GUC variable */
@@ -204,7 +205,6 @@ typedef PageHeaderData *PageHeader;
  * handling pages.
  */
 #define PG_PAGE_LAYOUT_VERSION		4
-#define PG_DATA_CHECKSUM_VERSION	1
 
 /* ----------------------------------------------------------------
  *						page support functions
diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..0faaac14b1b 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,21 @@
 
 #include "storage/block.h"
 
+/*
+ * Checksum version 0 is used when data checksums are disabled (OFF).
+ * PG_DATA_CHECKSUM_VERSION means that data checksums are enabled in the
+ * cluster, and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION means that data
+ * checksums are currently being enabled or disabled, respectively.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION,
+	PG_DATA_CHECKSUM_ANY_VERSION
+} ChecksumType;
+
 /*
  * Compute the checksum for a Postgres page.  The page must be aligned on a
  * 4-byte boundary.
diff --git a/src/include/storage/lwlocklist.h b/src/include/storage/lwlocklist.h
index 5b0ce383408..071b553ee32 100644
--- a/src/include/storage/lwlocklist.h
+++ b/src/include/storage/lwlocklist.h
@@ -86,6 +86,7 @@ PG_LWLOCK(51, InjectionPoint)
 PG_LWLOCK(52, SerialControl)
 PG_LWLOCK(53, AioWorkerSubmissionQueue)
 PG_LWLOCK(54, WaitLSN)
+PG_LWLOCK(55, DataChecksumsWorker)
 
 /*
  * There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
  * Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
  * run during normal operation.  Startup process and WAL receiver also consume
  * 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The data checksums launcher and worker
+ * can consume 2 more slots while data checksums are being enabled or disabled.
  */
 #define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)
 
 /* configurable options */
 extern PGDLLIMPORT int DeadlockTimeout;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
 typedef enum
 {
 	PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/utils/backend_progress.h b/src/include/utils/backend_progress.h
index dda813ab407..c664e92dbfe 100644
--- a/src/include/utils/backend_progress.h
+++ b/src/include/utils/backend_progress.h
@@ -28,6 +28,7 @@ typedef enum ProgressCommandType
 	PROGRESS_COMMAND_CREATE_INDEX,
 	PROGRESS_COMMAND_BASEBACKUP,
 	PROGRESS_COMMAND_COPY,
+	PROGRESS_COMMAND_DATACHECKSUMS,
 } ProgressCommandType;
 
 #define PGSTAT_NUM_PROGRESS_PARAM	20
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index d079b91b1a2..21c63481e99 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -19,6 +19,7 @@ SUBDIRS = \
 		  test_binaryheap \
 		  test_bitmapset \
 		  test_bloomfilter \
+		  test_checksums \
 		  test_copy_callbacks \
 		  test_custom_rmgrs \
 		  test_ddl_deparse \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index f5114469b92..4a289be05bb 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -18,6 +18,7 @@ subdir('test_aio')
 subdir('test_binaryheap')
 subdir('test_bitmapset')
 subdir('test_bloomfilter')
+subdir('test_checksums')
 subdir('test_copy_callbacks')
 subdir('test_custom_rmgrs')
 subdir('test_ddl_deparse')
diff --git a/src/test/modules/test_checksums/.gitignore b/src/test/modules/test_checksums/.gitignore
new file mode 100644
index 00000000000..871e943d50e
--- /dev/null
+++ b/src/test/modules/test_checksums/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/modules/test_checksums/Makefile b/src/test/modules/test_checksums/Makefile
new file mode 100644
index 00000000000..a5b6259a728
--- /dev/null
+++ b/src/test/modules/test_checksums/Makefile
@@ -0,0 +1,40 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/modules/test_checksums
+#
+# Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/modules/test_checksums/Makefile
+#
+#-------------------------------------------------------------------------
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+export enable_injection_points
+
+MODULE_big = test_checksums
+OBJS = \
+	$(WIN32RES) \
+	test_checksums.o
+PGFILEDESC = "test_checksums - test code for data checksums"
+
+EXTENSION = test_checksums
+DATA = test_checksums--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_checksums
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
diff --git a/src/test/modules/test_checksums/README b/src/test/modules/test_checksums/README
new file mode 100644
index 00000000000..0f0317060b3
--- /dev/null
+++ b/src/test/modules/test_checksums/README
@@ -0,0 +1,22 @@
+src/test/modules/test_checksums/README
+
+Regression tests for data checksums
+===================================
+
+This directory contains a test suite for enabling data checksums
+in a running cluster.
+
+Running the tests
+=================
+
+    make check
+
+or
+
+    make installcheck
+
+NOTE: This creates a temporary installation (in the case of "check"),
+with multiple nodes (a primary and one or more standbys) for the purpose
+of the tests.
+
+NOTE: This requires the --enable-tap-tests argument to configure.
diff --git a/src/test/modules/test_checksums/meson.build b/src/test/modules/test_checksums/meson.build
new file mode 100644
index 00000000000..ffc737ca87a
--- /dev/null
+++ b/src/test/modules/test_checksums/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+test_checksums_sources = files(
+	'test_checksums.c',
+)
+
+test_checksums = shared_module('test_checksums',
+	test_checksums_sources,
+	kwargs: pg_test_mod_args,
+)
+test_install_libs += test_checksums
+
+test_install_data += files(
+	'test_checksums.control',
+	'test_checksums--1.0.sql',
+)
+
+tests += {
+  'name': 'test_checksums',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+       'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+	},
+    'tests': [
+      't/001_basic.pl',
+      't/002_restarts.pl',
+      't/003_standby_restarts.pl',
+      't/004_offline.pl',
+      't/005_injection.pl',
+      't/006_pgbench_single.pl',
+      't/007_pgbench_standby.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_checksums/t/001_basic.pl b/src/test/modules/test_checksums/t/001_basic.pl
new file mode 100644
index 00000000000..728a5c4510c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/001_basic.pl
@@ -0,0 +1,63 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are turned off
+test_checksum_state($node, 'off');
+
+# Enable data checksums and wait for the state transition to 'on'
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1 ");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Enable data checksums again which should be a no-op so we explicitly don't
+# wait for any state transition as none should happen here
+enable_data_checksums($node);
+test_checksum_state($node, 'on');
+# ..and make sure we can still read/write data
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+# Disable checksums again and wait for the state transition
+disable_data_checksums($node, wait => 1);
+
+# Test reading data again
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure previously checksummed pages can be read back');
+
+# Re-enable checksums and make sure that the underlying data has changed to
+# ensure that checksums will be different.
+$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+enable_data_checksums($node, wait => 'on');
+
+# Run a dummy query just to make sure we can read back the data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '10000', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/002_restarts.pl b/src/test/modules/test_checksums/t/002_restarts.pl
new file mode 100644
index 00000000000..6c17f304eac
--- /dev/null
+++ b/src/test/modules/test_checksums/t/002_restarts.pl
@@ -0,0 +1,110 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with a
+# restart which breaks processing.
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Initialize result storage for queries
+my $result;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 6
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Create a barrier for checksumming to block on, in this case a pre-
+	# existing temporary table which is kept open while processing is started.
+	# We can accomplish this by setting up an interactive psql process which
+	# keeps the temporary table created as we enable checksums in another psql
+	# process.
+	#
+	# This is a similar test to the synthetic variant in 005_injection.pl
+	# which fakes this scenario.
+	my $bsession = $node->background_psql('postgres');
+	$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+	# In another session, make sure we can see the blocking temp table but
+	# start processing anyway and check that we are blocked with a proper
+	# wait event.
+	$result = $node->safe_psql('postgres',
+		"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';"
+	);
+	is($result, 't', 'ensure we can see the temporary table');
+
+	# Enabling data checksums shouldn't complete since processing is blocked
+	# on the temporary table held open by $bsession.  Ensure that we reach
+	# inprogress-on before doing more tests.
+	enable_data_checksums($node, wait => 'inprogress-on');
+
+	# Wait for the worker to reach the state where it is waiting for leftover
+	# temp relations to disappear before it can finish
+	$result = $node->poll_query_until(
+		'postgres',
+		"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksum worker';",
+		'ChecksumEnableTemptableWait');
+
+	# The datachecksumsworker waits for temporary tables to disappear for 3
+	# seconds before retrying, so sleep for 4 seconds to be guaranteed to see
+	# a retry cycle
+	sleep(4);
+
+	# Re-check the wait event to ensure we are blocked on the right thing.
+	$result = $node->safe_psql('postgres',
+			"SELECT wait_event FROM pg_catalog.pg_stat_activity "
+		  . "WHERE backend_type = 'datachecksum worker';");
+	is($result, 'ChecksumEnableTemptableWait',
+		'ensure the correct wait condition is set');
+	test_checksum_state($node, 'inprogress-on');
+
+	# Stop the cluster while bsession is still attached.  We can't close the
+	# session first since the brief period between closing and stopping might
+	# be enough for checksums to get enabled.
+	$node->stop;
+	$bsession->quit;
+	$node->start;
+
+	# Ensure the checksums aren't enabled across the restart.  This leaves the
+	# cluster in the same state as before we entered the SKIP block.
+	test_checksum_state($node, 'off');
+}
+
+enable_data_checksums($node, wait => 'on');
+
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$result = $node->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+disable_data_checksums($node, wait => 1);
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
new file mode 100644
index 00000000000..f724d4ea74c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -0,0 +1,114 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# streaming replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+$node_primary->start;
+
+my $slotname = 'physical_slot';
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname')");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $backup_name,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+]);
+$node_standby_1->start;
+
+# Create some content on the primary to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Wait for standbys to catch up
+$node_primary->wait_for_catchup($node_standby_1, 'replay',
+	$node_primary->lsn('insert'));
+
+# Check that checksums are turned off on all nodes
+test_checksum_state($node_primary, 'off');
+test_checksum_state($node_standby_1, 'off');
+
+# ---------------------------------------------------------------------------
+# Enable checksums for the cluster, and make sure that both the primary and
+# standby change state.
+#
+
+# Ensure that the primary switches to "inprogress-on"
+enable_data_checksums($node_primary, wait => 'inprogress-on');
+# Wait for checksum enable to be replayed
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Ensure that the standby has switched to "inprogress-on" or "on".  Normally it
+# would be "inprogress-on", but it is theoretically possible for the primary to
+# complete the checksum enabling *and* have the standby replay that record
+# before we reach the check below.
+my $result = $node_standby_1->poll_query_until(
+	'postgres',
+	"SELECT setting = 'off' FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+	'f');
+is($result, 1, 'ensure standby has absorbed the inprogress-on barrier');
+$result = $node_standby_1->safe_psql('postgres',
+	"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+);
+
+is(($result eq 'inprogress-on' || $result eq 'on'),
+	1, 'ensure checksums are on, or in progress, on standby_1');
+
+# Insert some more data which should be checksummed on INSERT
+$node_primary->safe_psql('postgres',
+	"INSERT INTO t VALUES (generate_series(1, 10000));");
+
+# Wait for checksums enabled on the primary and standby
+wait_for_checksum_state($node_primary, 'on');
+wait_for_checksum_state($node_standby_1, 'on');
+
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, '19998', 'ensure we can safely read all data with checksums');
+
+$result = $node_primary->poll_query_until(
+	'postgres',
+	"SELECT count(*) FROM pg_stat_activity WHERE backend_type LIKE 'datachecksumsworker%';",
+	'0');
+is($result, 1, 'await datachecksums worker/launcher termination');
+
+#
+# Disable checksums and ensure it's propagated to standby and that we can
+# still read all data
+#
+
+# Disable checksums and wait for the operation to be replayed
+disable_data_checksums($node_primary);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+# Ensure that the primary and standby have switched to off
+wait_for_checksum_state($node_primary, 'off');
+wait_for_checksum_state($node_standby_1, 'off');
+# Doublecheck reading data without errors
+$result =
+  $node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
+is($result, "19998", 'ensure we can safely read all data without checksums');
+
+$node_standby_1->stop;
+$node_primary->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/004_offline.pl b/src/test/modules/test_checksums/t/004_offline.pl
new file mode 100644
index 00000000000..e9fbcf77eab
--- /dev/null
+++ b/src/test/modules/test_checksums/t/004_offline.pl
@@ -0,0 +1,82 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums offline from various states
+# of checksum processing
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+# Initialize node with checksums disabled.
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1,10000) AS a;");
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Enable checksums offline using pg_checksums
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are enabled
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+# Disable checksums offline again using pg_checksums
+$node->stop;
+$node->checksum_disable_offline;
+$node->start;
+
+# Ensure that checksums are disabled
+test_checksum_state($node, 'off');
+
+# Create a barrier for checksumming to block on, in this case a pre-existing
+# temporary table which is kept open while processing is started. We can
+# accomplish this by setting up an interactive psql process which keeps the
+# temporary table created as we enable checksums in another psql process.
+
+my $bsession = $node->background_psql('postgres');
+$bsession->query_safe('CREATE TEMPORARY TABLE tt (a integer);');
+
+# In another session, make sure we can see the blocking temp table, then
+# start processing anyway; processing will block on the temp table.
+$result = $node->safe_psql('postgres',
+	"SELECT relpersistence FROM pg_catalog.pg_class WHERE relname = 'tt';");
+is($result, 't', 'ensure we can see the temporary table');
+
+enable_data_checksums($node, wait => 'inprogress-on');
+
+# Turn the cluster off and enable checksums offline, then start back up
+$bsession->quit;
+$node->stop;
+$node->checksum_enable_offline;
+$node->start;
+
+# Ensure that checksums are now enabled even though processing wasn't
+# restarted
+test_checksum_state($node, 'on');
+
+# Run a dummy query just to make sure we can read back some data
+$result = $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '9999', 'ensure checksummed pages can be read back');
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/005_injection.pl b/src/test/modules/test_checksums/t/005_injection.pl
new file mode 100644
index 00000000000..ae801cd336f
--- /dev/null
+++ b/src/test/modules/test_checksums/t/005_injection.pl
@@ -0,0 +1,126 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# injection point tests injecting failures into the processing
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# ---------------------------------------------------------------------------
+# Test cluster setup
+#
+
+# Initiate testcluster
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init(no_data_checksums => 1);
+$node->start;
+
+# Set up test environment
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+
+# ---------------------------------------------------------------------------
+# Inducing failures and crashes in processing
+
+# Force enabling checksums to fail by marking one of the databases as having
+# failed in processing.
+disable_data_checksums($node, wait => 1);
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(true);');
+enable_data_checksums($node, wait => 'off');
+$node->safe_psql('postgres', 'SELECT dcw_inject_fail_database(false);');
+
+# Force the server to crash after enabling data checksums but before issuing
+# the checkpoint.  Since the switch has been WAL logged the server should come
+# up with checksums enabled after replay.
+test_checksum_state($node, 'off');
+$node->safe_psql('postgres', 'SELECT dc_crash_before_checkpoint();');
+enable_data_checksums($node, fast => 'true');
+my $ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed due to abort() before checkpointing");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'on');
+
+# Another test just like the previous, but for disabling data checksums (and
+# crashing just before checkpointing).  The previous injection points were all
+# detached by the crash, so they need to be reattached.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_checkpoint();');
+disable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed due to abort() before checkpointing");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# Now inject a crash before inserting the WAL record for data checksum state
+# change, when the server comes back up again the state should not have been
+# set to the new value since the process didn't succeed.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_xlog();');
+enable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# Re-run the same test again: with data checksums still disabled, attempt to
+# enable them and crash right before inserting the WAL record.  When the
+# server comes back up, checksums must not be enabled.
+$node->safe_psql('postgres', 'SELECT dc_crash_before_xlog();');
+enable_data_checksums($node);
+$ret = wait_for_cluster_crash($node);
+ok($ret == 1, "Cluster crash detection timeout");
+ok(!$node->is_alive, "Cluster crashed");
+$node->_update_pid(-1);
+$node->start;
+test_checksum_state($node, 'off');
+
+# ---------------------------------------------------------------------------
+# Timing and retry related tests
+#
+
+# Force the enable checksums processing to make multiple passes by removing
+# one database from the list in the first pass.  This will simulate a CREATE
+# DATABASE during processing.  Doing this via fault injection makes the test
+# not be subject to exact timing.
+$node->safe_psql('postgres', 'SELECT dcw_prune_dblist(true);');
+enable_data_checksums($node, wait => 'on');
+
+SKIP:
+{
+	skip 'Data checksum delay tests not enabled in PG_TEST_EXTRA', 4
+	  if (!$ENV{PG_TEST_EXTRA}
+		|| $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/);
+
+	# Inject a delay in the barrier for enabling checksums
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_inject_delay_barrier();');
+	enable_data_checksums($node, wait => 'on');
+
+	# Fake the existence of a temporary table at the start of processing,
+	# which forces the processing to wait and retry until the table has
+	# disappeared.
+	disable_data_checksums($node, wait => 1);
+	$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(true);');
+	enable_data_checksums($node, wait => 'on');
+}
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_checksums/t/006_pgbench_single.pl b/src/test/modules/test_checksums/t/006_pgbench_single.pl
new file mode 100644
index 00000000000..96f3b2cd8a6
--- /dev/null
+++ b/src/test/modules/test_checksums/t/006_pgbench_single.pl
@@ -0,0 +1,268 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster with
+# concurrent activity via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+my $node;
+my $node_loglocation = 0;
+
+# The number of full test iterations which will be performed. The exact number
+# of tests performed and the wall time taken are non-deterministic as the test
+# performs a lot of randomized actions, but 10 iterations will be a long test
+# run regardless.
+my $TEST_ITERATIONS = 10;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+my $pgbench = undef;
+
+# determines whether enable_data_checksums/disable_data_checksums forces an
+# immediate checkpoint
+my @flip_modes = ('true', 'false');
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter.
+sub background_rw_pgbench
+{
+	my $port = shift;
+
+	# If a previous pgbench is still running, start by shutting it down.
+	if ($pgbench)
+	{
+		$pgbench->finish;
+	}
+
+	# Randomize the number of pgbench clients a bit (range 1-15)
+	my $clients = 1 + int(rand(15));
+
+	my @cmd = ('pgbench', '-p', $port, '-T', '600', '-c', $clients);
+
+	# Randomize whether we spawn connections or not
+	push(@cmd, '-C') if (cointoss);
+	# Finally add the database name to use
+	push(@cmd, 'postgres');
+
+	$pgbench = IPC::Run::start(
+		\@cmd,
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Invert the state of data checksums in the cluster, if data checksums are on
+# then disable them and vice versa. Also performs proper validation of the
+# before and after state.
+sub flip_data_checksums
+{
+	# First, make sure the cluster is in the state we expect it to be
+	test_checksum_state($node, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node->safe_psql('postgres', 'SELECT dcw_fake_temptable();')
+		  if cointoss();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before enabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums(
+			$node,
+			wait => 'inprogress-on',
+			'fast' => $mode);
+
+		random_sleep();
+
+		# Wait for checksums enabled on the primary
+		wait_for_checksum_state($node, 'on');
+
+		# log LSN right after the primary flips checksums to "on"
+		$result = $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after enabling: " . $result . "\n");
+
+		random_sleep();
+
+		$node->safe_psql('postgres', 'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before disabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		disable_data_checksums($node, 'fast' => $mode);
+
+		# Wait for checksums disabled on the primary
+		wait_for_checksum_state($node, 'off');
+
+		# log LSN right after the primary flips checksums to "off"
+		$result = $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after disabling: " . $result . "\n");
+
+		random_sleep();
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since it could otherwise slip by unnoticed, bail out
+		# of the test run if so.
+		BAIL_OUT('data_checksum_state variable has invalid state: '
+			  . $data_checksum_state);
+	}
+}
+
+# Create and start a cluster with one node
+$node = PostgreSQL::Test::Cluster->new('main');
+$node->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be bumped in order to accommodate the pgbench
+# clients, and log_statement is dialed down since it would otherwise generate
+# enormous amounts of logging.  Page verification failures are still logged.
+$node->append_conf(
+	'postgresql.conf',
+	qq[
+max_connections = 100
+log_statement = none
+]);
+$node->start;
+$node->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+# Initialize pgbench
+$node->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+# Start the test suite with pgbench running.
+background_rw_pgbench($node->port);
+
+# Main test suite. This loop will start a pgbench run on the cluster and while
+# that's running flip the state of data checksums concurrently. It will then
+# randomly restart the cluster (in fast or immediate mode) and then check for
+# the desired state.  The idea behind doing things randomly is to stress out
+# any timing related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced to
+# its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	note("iteration ", ($i + 1), " of ", $TEST_ITERATIONS);
+
+	if (!$node->is_alive)
+	{
+		random_sleep();
+
+		# Start, to do recovery, and stop
+		$node->start;
+		$node->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile,
+			$node_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (during WAL recovery)"
+		);
+		$node_loglocation = -s $node->logfile;
+
+		# Randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(1024));
+		$node->append_conf('postgresql.conf', qq[max_wal_size = $sb]);
+
+		$node->start;
+
+		# Start a pgbench in the background against the primary
+		background_rw_pgbench($node->port);
+	}
+
+	$node->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result =
+	  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+
+	random_sleep();
+
+	# Potentially powercycle the node
+	if (cointoss())
+	{
+		$node->stop(stopmode());
+
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node->data_dir);
+
+		my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile,
+			$node_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (outside WAL recovery)"
+		);
+		$node_loglocation = -s $node->logfile;
+	}
+
+	random_sleep();
+}
+
+# Make sure the node is running
+if (!$node->is_alive)
+{
+	$node->start;
+}
+
+# Testrun is over, ensure that data reads back as expected and perform a final
+# verification of the data checksum state.
+my $result =
+  $node->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node, $data_checksum_state);
+
+# Perform one final pass over the logs and hunt for unexpected errors
+my $log =
+  PostgreSQL::Test::Utils::slurp_file($node->logfile, $node_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in primary log");
+$node_loglocation = -s $node->logfile;
+
+$node->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/007_pgbench_standby.pl b/src/test/modules/test_checksums/t/007_pgbench_standby.pl
new file mode 100644
index 00000000000..8b8e031cbf6
--- /dev/null
+++ b/src/test/modules/test_checksums/t/007_pgbench_standby.pl
@@ -0,0 +1,398 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test suite for testing enabling data checksums in an online cluster,
+# consisting of a primary and a replicated standby, with concurrent activity
+# via pgbench runs
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+use DataChecksums::Utils;
+
+my $node_primary_slot = 'physical_slot';
+my $node_primary_backup = 'primary_backup';
+my $node_primary;
+my $node_primary_loglocation = 0;
+my $node_standby_1;
+my $node_standby_1_loglocation = 0;
+
+# The number of full test iterations which will be performed. The exact number
+# of tests performed and the wall time taken are non-deterministic as the test
+# performs a lot of randomized actions, but 5 iterations will be a long test
+# run regardless.
+my $TEST_ITERATIONS = 5;
+
+# Variables which record the current state of the cluster
+my $data_checksum_state = 'off';
+
+my $pgbench_primary = undef;
+my $pgbench_standby = undef;
+
+# Variables holding state for managing the cluster and aux processes in
+# various ways
+my ($pgb_primary_stdin, $pgb_primary_stdout, $pgb_primary_stderr) =
+  ('', '', '');
+my ($pgb_standby_1_stdin, $pgb_standby_1_stdout, $pgb_standby_1_stderr) =
+  ('', '', '');
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bchecksum_extended\b/)
+{
+	plan skip_all => 'Extended tests not enabled';
+}
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+# determines whether enable_data_checksums/disable_data_checksums forces an
+# immediate checkpoint
+my @flip_modes = ('true', 'false');
+
+# Start a pgbench run in the background against the server specified via the
+# port passed as parameter
+sub background_pgbench
+{
+	my ($port, $standby) = @_;
+
+	# Terminate any currently running pgbench process before continuing
+	$pgbench_primary->finish if $pgbench_primary;
+
+	my $clients = 1 + int(rand(15));
+
+	my @cmd = ('pgbench', '-p', $port, '-T', '600', '-c', $clients);
+	# Randomize whether we spawn connections or not
+	push(@cmd, '-C') if (cointoss());
+	# If we run on a standby it needs to be a read-only benchmark
+	push(@cmd, '-S') if ($standby);
+	# Finally add the database name to use
+	push(@cmd, 'postgres');
+
+	$pgbench_primary = IPC::Run::start(
+		\@cmd,
+		'<' => '/dev/null',
+		'>' => '/dev/null',
+		'2>' => '/dev/null',
+		IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default));
+}
+
+# Invert the state of data checksums in the cluster, if data checksums are on
+# then disable them and vice versa. Also performs proper validation of the
+# before and after state.
+sub flip_data_checksums
+{
+	test_checksum_state($node_primary, $data_checksum_state);
+	test_checksum_state($node_standby_1, $data_checksum_state);
+
+	if ($data_checksum_state eq 'off')
+	{
+		# Coin-toss to see if we are injecting a retry due to a temptable
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(true);')
+		  if cointoss();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before enabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		# Ensure that the primary switches to "inprogress-on"
+		enable_data_checksums(
+			$node_primary,
+			wait => 'inprogress-on',
+			'fast' => $mode);
+		random_sleep();
+		# Wait for checksum enable to be replayed
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Ensure that the standby has switched to "inprogress-on" or "on".
+		# Normally it would be "inprogress-on", but it is theoretically
+		# possible for the primary to complete the checksum enabling *and* have
+		# the standby replay that record before we reach the check below.
+		$result = $node_standby_1->poll_query_until(
+			'postgres',
+			"SELECT setting = 'off' "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';",
+			'f');
+		is($result, 1,
+			'ensure standby has absorbed the inprogress-on barrier');
+		random_sleep();
+		$result = $node_standby_1->safe_psql('postgres',
+				"SELECT setting "
+			  . "FROM pg_catalog.pg_settings "
+			  . "WHERE name = 'data_checksums';");
+
+		is(($result eq 'inprogress-on' || $result eq 'on'),
+			1, 'ensure checksums are on, or in progress, on standby_1');
+
+		# Wait for checksums enabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'on');
+
+		# log LSN right after the primary flips checksums to "on"
+		$result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after enabling: " . $result . "\n");
+
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'on');
+
+		$node_primary->safe_psql('postgres',
+			'SELECT dcw_fake_temptable(false);');
+		$data_checksum_state = 'on';
+	}
+	elsif ($data_checksum_state eq 'on')
+	{
+		random_sleep();
+
+		# log LSN right before we start changing checksums
+		my $result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN before disabling: " . $result . "\n");
+
+		my $mode = $flip_modes[ int(rand(@flip_modes)) ];
+
+		disable_data_checksums($node_primary, 'fast' => $mode);
+		$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+		# Wait for checksums disabled on the primary and standby
+		wait_for_checksum_state($node_primary, 'off');
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		# log LSN right after the primary flips checksums to "off"
+		$result =
+		  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+		note("LSN after disabling: " . $result . "\n");
+
+		random_sleep();
+		wait_for_checksum_state($node_standby_1, 'off');
+
+		$data_checksum_state = 'off';
+	}
+	else
+	{
+		# This should only happen due to programmer error when hacking on the
+		# test code, but since it could otherwise slip by unnoticed, make sure
+		# it is caught with a test failure if so.
+		is(1, 0, 'data_checksum_state variable has invalid state');
+	}
+}
+
+# Create and start a cluster with one primary and one standby node, and ensure
+# they are caught up and in sync.
+$node_primary = PostgreSQL::Test::Cluster->new('main');
+$node_primary->init(allows_streaming => 1, no_data_checksums => 1);
+# max_connections needs to be bumped in order to accommodate the pgbench
+# clients, and log_statement is dialed down since it would otherwise generate
+# enormous amounts of logging.  Page verification failures are still logged.
+$node_primary->append_conf(
+	'postgresql.conf',
+	qq[
+max_connections = 30
+log_statement = none
+]);
+$node_primary->start;
+$node_primary->safe_psql('postgres', 'CREATE EXTENSION test_checksums;');
+# Create some content to have un-checksummed data in the cluster
+$node_primary->safe_psql('postgres',
+	"CREATE TABLE t AS SELECT generate_series(1, 100000) AS a;");
+$node_primary->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$node_primary_slot');");
+$node_primary->backup($node_primary_backup);
+
+$node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
+$node_standby_1->init_from_backup($node_primary, $node_primary_backup,
+	has_streaming => 1);
+$node_standby_1->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$node_primary_slot'
+]);
+$node_standby_1->start;
+
+# Initialize pgbench and wait for the objects to be created on the standby
+$node_primary->command_ok([ 'pgbench', '-i', '-s', '100', '-q', 'postgres' ]);
+$node_primary->wait_for_catchup($node_standby_1, 'replay');
+
+# Start the test suite with pgbench running on all nodes
+background_pgbench($node_standby_1->port, 1);
+background_pgbench($node_primary->port, 0);
+
+# Main test suite. This loop will start a pgbench run on the cluster and while
+# that's running flip the state of data checksums concurrently. It will then
+# randomly restart the cluster (in fast or immediate mode) and then check for
+# the desired state.  The idea behind doing things randomly is to stress out
+# any timing related issues by subjecting the cluster to varied workloads.
+# A TODO is to generate a trace such that any test failure can be traced to
+# its order of operations for debugging.
+for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
+{
+	note("iteration ", ($i + 1), " of ", $TEST_ITERATIONS);
+
+	if (!$node_primary->is_alive)
+	{
+		random_sleep();
+
+		# start, to do recovery, and stop
+		$node_primary->start;
+		$node_primary->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+			$node_primary_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (during WAL recovery)"
+		);
+		$node_primary_loglocation = -s $node_primary->logfile;
+
+		# randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(960));
+		$node_primary->append_conf('postgresql.conf', qq[max_wal_size = $sb]);
+
+		note("changing primary max_wal_size to " . $sb);
+
+		$node_primary->start;
+
+		# Start a pgbench in the background against the primary
+		background_pgbench($node_primary->port, 0);
+	}
+
+	if (!$node_standby_1->is_alive)
+	{
+		random_sleep();
+
+		$node_standby_1->start;
+		$node_standby_1->stop('fast');
+
+		# Since the log isn't being written to now, parse the log and check
+		# for instances of checksum verification failures.
+		my $log =
+		  PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+			$node_standby_1_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in standby_1 log (during WAL recovery)"
+		);
+		$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+		# randomize the WAL size, to trigger checkpoints less/more often
+		my $sb = 64 + int(rand(960));
+		$node_standby_1->append_conf('postgresql.conf',
+			qq[max_wal_size = $sb]);
+
+		note("changing standby max_wal_size to " . $sb);
+
+		$node_standby_1->start;
+
+		# Start a select-only pgbench in the background on the standby
+		background_pgbench($node_standby_1->port, 1);
+	}
+
+	$node_primary->safe_psql('postgres', "UPDATE t SET a = a + 1;");
+
+	flip_data_checksums();
+	random_sleep();
+	my $result = $node_primary->safe_psql('postgres',
+		"SELECT count(*) FROM t WHERE a > 1");
+	is($result, '100000', 'ensure data pages can be read back on primary');
+	random_sleep();
+	$node_primary->wait_for_catchup($node_standby_1, 'write');
+
+	random_sleep();
+
+	# Potentially powercycle the cluster (the nodes independently)
+	# XXX should maybe try stopping nodes in the opposite order too?
+	if (cointoss())
+	{
+		$node_primary->stop(stopmode());
+
+		# print the contents of the control file on the primary
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node_primary->data_dir);
+
+		# slurp the file after shutdown, so that it doesn't interfere with the recovery
+		my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+			$node_primary_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in primary log (outside WAL recovery)"
+		);
+		$node_primary_loglocation = -s $node_primary->logfile;
+	}
+
+	random_sleep();
+
+	if (cointoss())
+	{
+		$node_standby_1->stop(stopmode());
+
+		# print the contents of the control file on the standby
+		PostgreSQL::Test::Utils::system_log("pg_controldata",
+			$node_standby_1->data_dir);
+
+		# slurp the file after shutdown, so that it doesn't interfere with the recovery
+		my $log =
+		  PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+			$node_standby_1_loglocation);
+		unlike(
+			$log,
+			qr/page verification failed/,
+			"no checksum validation errors in standby_1 log (outside WAL recovery)"
+		);
+		$node_standby_1_loglocation = -s $node_standby_1->logfile;
+	}
+}
+
+# make sure the nodes are running
+if (!$node_primary->is_alive)
+{
+	$node_primary->start;
+}
+
+if (!$node_standby_1->is_alive)
+{
+	$node_standby_1->start;
+}
+
+# Testrun is over, ensure that data reads back as expected and perform a final
+# verification of the data checksum state.
+my $result =
+  $node_primary->safe_psql('postgres', "SELECT count(*) FROM t WHERE a > 1");
+is($result, '100000', 'ensure data pages can be read back on primary');
+test_checksum_state($node_primary, $data_checksum_state);
+test_checksum_state($node_standby_1, $data_checksum_state);
+
+# Perform one final pass over the logs and hunt for unexpected errors
+my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile,
+	$node_primary_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in primary log");
+$node_primary_loglocation = -s $node_primary->logfile;
+$log = PostgreSQL::Test::Utils::slurp_file($node_standby_1->logfile,
+	$node_standby_1_loglocation);
+unlike(
+	$log,
+	qr/page verification failed/,
+	"no checksum validation errors in standby_1 log");
+$node_standby_1_loglocation = -s $node_standby_1->logfile;
+
+$node_standby_1->teardown_node;
+$node_primary->teardown_node;
+
+done_testing();
diff --git a/src/test/modules/test_checksums/t/DataChecksums/Utils.pm b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
new file mode 100644
index 00000000000..cf670be944c
--- /dev/null
+++ b/src/test/modules/test_checksums/t/DataChecksums/Utils.pm
@@ -0,0 +1,283 @@
+
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+DataChecksums::Utils - Utility functions for testing data checksums in a running cluster
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::Cluster;
+  use DataChecksums::Utils qw( .. );
+
+  # Create, and start, a new cluster
+  my $node = PostgreSQL::Test::Cluster->new('primary');
+  $node->init;
+  $node->start;
+
+  test_checksum_state($node, 'off');
+
+  enable_data_checksums($node);
+
+  wait_for_checksum_state($node, 'on');
+
+
+=cut
+
+package DataChecksums::Utils;
+
+use strict;
+use warnings FATAL => 'all';
+use Exporter 'import';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+our @EXPORT = qw(
+  cointoss
+  disable_data_checksums
+  enable_data_checksums
+  random_sleep
+  stopmode
+  test_checksum_state
+  wait_for_checksum_state
+  wait_for_cluster_crash
+);
+
+=pod
+
+=head1 METHODS
+
+=over
+
+=item test_checksum_state(node, state)
+
+Test that the current value of the data checksum GUC in the server running
+at B<node> matches B<state>.  If the values differ, a test failure is logged.
+Returns True if the values match, otherwise False.
+
+=cut
+
+sub test_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $result = $postgresnode->safe_psql('postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';"
+	);
+	is($result, $state, 'ensure checksums are set to ' . $state);
+	return $result eq $state;
+}
+
+=item wait_for_checksum_state(node, state)
+
+Test the value of the data checksum GUC in the server running at B<node>
+repeatedly until it matches B<state> or times out.  Processing will run for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.  If the
+values differ when the process times out, False is returned and a test failure
+is logged, otherwise True.
+
+=cut
+
+sub wait_for_checksum_state
+{
+	my ($postgresnode, $state) = @_;
+
+	my $res = $postgresnode->poll_query_until(
+		'postgres',
+		"SELECT setting FROM pg_catalog.pg_settings WHERE name = 'data_checksums';",
+		$state);
+	is($res, 1, 'ensure data checksums are transitioned to ' . $state);
+	return $res == 1;
+}
+
+=item wait_for_cluster_crash(node, params)
+
+Repeatedly test if the cluster running at B<node> responds to connections
+and return when it no longer does so, or when it times out.  Processing will
+run for $PostgreSQL::Test::Utils::timeout_default seconds unless a timeout
+value is specified as a parameter.  Returns True if the cluster crashed, else
+False if the process timed out.
+
+=over
+
+=item timeout
+
+Approximate number of seconds to wait for cluster to crash, default is
+$PostgreSQL::Test::Utils::timeout_default.  There is no real-time guarantee
+that the total processing time won't exceed the timeout.
+
+=back
+
+=cut
+
+sub wait_for_cluster_crash
+{
+	my $postgresnode = shift;
+	my %params = @_;
+	my $crash = 0;
+
+	$params{timeout} = $PostgreSQL::Test::Utils::timeout_default
+	  unless (defined($params{timeout}));
+
+	for (my $naps = 0; $naps < $params{timeout}; $naps++)
+	{
+		if (!$postgresnode->is_alive)
+		{
+			$crash = 1;
+			last;
+		}
+		sleep(1);
+	}
+
+	return $crash == 1;
+}
+
+=item enable_data_checksums($node, %params)
+
+Function for enabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item cost_delay
+
+The B<cost_delay> to use when enabling data checksums, default is 0.
+
+=item cost_limit
+
+The B<cost_limit> to use when enabling data checksums, default is 100.
+
+=item fast
+
+If set to C<true> an immediate checkpoint will be issued after data
+checksums are enabled.  Setting this to false will lead to slower tests.
+The default is C<true>.
+
+=item wait
+
+If defined, the function will wait for the state given in this parameter,
+or for the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+
+=back
+
+=cut
+
+sub enable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{cost_delay} = 0 unless (defined($params{cost_delay}));
+	$params{cost_limit} = 100 unless (defined($params{cost_limit}));
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_enable_data_checksums(%s, %s, %s);
+EOQ
+
+	$postgresnode->safe_psql(
+		'postgres',
+		sprintf($query,
+			$params{cost_delay}, $params{cost_limit}, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, $params{wait})
+	  if (defined($params{wait}));
+}
+
+=item disable_data_checksums($node, %params)
+
+Function for disabling data checksums in the cluster running at B<node>.
+
+=over
+
+=item wait
+
+If defined, the function will wait for the state to turn to B<off>, or for
+the wait to time out, before returning.  The function will wait for
+$PostgreSQL::Test::Utils::timeout_default seconds before timing out.
+Unlike in C<enable_data_checksums>, the value of the parameter is discarded.
+
+=item fast
+
+If set to C<true> the checkpoint after disabling will be set to immediate, else
+it will be deferred.  The default if no value is set is B<true>.
+
+=back
+
+=cut
+
+sub disable_data_checksums
+{
+	my $postgresnode = shift;
+	my %params = @_;
+
+	# Set sane defaults for the parameters
+	$params{fast} = 'true' unless (defined($params{fast}));
+
+	my $query = <<'EOQ';
+SELECT pg_disable_data_checksums(%s);
+EOQ
+
+	$postgresnode->safe_psql('postgres', sprintf($query, $params{fast}));
+
+	wait_for_checksum_state($postgresnode, 'off') if (defined($params{wait}));
+}
+
+=item cointoss
+
+Helper for retrieving a uniformly random binary value, used for making
+random decisions during testing.
+
+=cut
+
+sub cointoss
+{
+	return int(rand() < 0.5);
+}
+
+=item random_sleep(max)
+
+Helper for injecting random sleeps here and there in the testrun. The sleep
+duration will be in the range (0,B<max>), but won't be predictable in order to
+avoid sleep patterns that manage to avoid race conditions and timing bugs.
+The default B<max> is 3 seconds.
+
+=cut
+
+sub random_sleep
+{
+	my $max = shift;
+	sleep(int(rand(defined($max) ? $max : 3))) if cointoss;
+}
+
+=item stopmode
+
+Small helper function for randomly selecting a valid stopmode.
+
+=cut
+
+sub stopmode
+{
+	return 'immediate' if (cointoss);
+	return 'fast';
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/modules/test_checksums/test_checksums--1.0.sql b/src/test/modules/test_checksums/test_checksums--1.0.sql
new file mode 100644
index 00000000000..aa086d5c430
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums--1.0.sql
@@ -0,0 +1,28 @@
+/* src/test/modules/test_checksums/test_checksums--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_checksums" to load this file. \quit
+
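+-- The dcw_* helpers attach an injection point into the data checksums
+-- worker machinery when called with true, and detach it again when called
+-- with false.  The dc_crash_* helpers attach injection points which bring
+-- the server down with abort().
+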
+CREATE FUNCTION dcw_inject_delay_barrier(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_inject_fail_database(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_prune_dblist(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dcw_fake_temptable(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dc_crash_before_checkpoint(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION dc_crash_before_xlog(attach boolean DEFAULT true)
+	RETURNS pg_catalog.void
+	AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_checksums/test_checksums.c b/src/test/modules/test_checksums/test_checksums.c
new file mode 100644
index 00000000000..c182f2c868b
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.c
@@ -0,0 +1,225 @@
+/*--------------------------------------------------------------------------
+ *
+ * test_checksums.c
+ *		Test data checksums
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/test/modules/test_checksums/test_checksums.c
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "postmaster/datachecksumsworker.h"
+#include "storage/latch.h"
+#include "utils/injection_point.h"
+#include "utils/wait_event.h"
+
+#define USEC_PER_SEC    1000000
+
+PG_MODULE_MAGIC;
+
+extern PGDLLEXPORT void dc_delay_barrier(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fail_database(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_dblist(const char *name, const void *private_data, void *arg);
+extern PGDLLEXPORT void dc_fake_temptable(const char *name, const void *private_data, void *arg);
+
+extern PGDLLEXPORT void crash(const char *name, const void *private_data, void *arg);
+
+/*
+ * Test for delaying emission of procsignalbarriers.
+ */
+void
+dc_delay_barrier(const char *name, const void *private_data, void *arg)
+{
+	(void) name;
+	(void) private_data;
+
+	(void) WaitLatch(MyLatch,
+					 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+					 (3 * 1000),
+					 WAIT_EVENT_PG_SLEEP);
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_delay_barrier);
+Datum
+dcw_inject_delay_barrier(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-enable-checksums-delay",
+							 "test_checksums",
+							 "dc_delay_barrier",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksums-enable-checksums-delay");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
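+/*
+ * Test for failing the processing of a database.  Marks the result of the
+ * first database processed as failed; later invocations are no-ops.
+ */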
+void
+dc_fail_database(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	DataChecksumsWorkerResult *res = (DataChecksumsWorkerResult *) arg;
+
+	if (first_pass)
+		*res = DATACHECKSUMSWORKER_FAILED;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_inject_fail_database);
+Datum
+dcw_inject_fail_database(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fail-db",
+							 "test_checksums",
+							 "dc_fail_database",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fail-db");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to remove an entry from the Databaselist to force re-processing since
+ * not all databases could be processed in the first iteration of the loop.
+ */
+void
+dc_dblist(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	List	   *DatabaseList = (List *) arg;
+
+	if (first_pass)
+		DatabaseList = list_delete_last(DatabaseList);
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_prune_dblist);
+Datum
+dcw_prune_dblist(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-initial-dblist",
+							 "test_checksums",
+							 "dc_dblist",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-initial-dblist");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * Test to force waiting for existing temptables.
+ */
+void
+dc_fake_temptable(const char *name, const void *private_data, void *arg)
+{
+	static bool first_pass = true;
+	int		   *numleft = (int *) arg;
+
+	if (first_pass)
+		*numleft = 1;
+	first_pass = false;
+}
+
+PG_FUNCTION_INFO_V1(dcw_fake_temptable);
+Datum
+dcw_fake_temptable(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksumsworker-fake-temptable-wait",
+							 "test_checksums",
+							 "dc_fake_temptable",
+							 NULL,
+							 0);
+	else
+		InjectionPointDetach("datachecksumsworker-fake-temptable-wait");
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
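+/*
+ * Injection point callback which brings the server down hard with abort().
+ */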
+void
+crash(const char *name, const void *private_data, void *arg)
+{
+	abort();
+}
+
+/*
+ * dc_crash_before_checkpoint
+ *
+ * Ensure that the server crashes just before the checkpoint is issued after
+ * enabling or disabling checksums.
+ */
+PG_FUNCTION_INFO_V1(dc_crash_before_checkpoint);
+Datum
+dc_crash_before_checkpoint(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	InjectionPointAttach("datachecksums-enable-checksums-pre-checkpoint",
+						 "test_checksums", "crash", NULL, 0);
+	InjectionPointAttach("datachecksums-disable-checksums-pre-checkpoint",
+						 "test_checksums", "crash", NULL, 0);
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
+
+/*
+ * dc_crash_before_xlog
+ *
+ * Ensure that the server crashes right before it is about to insert the
+ * XLOG_CHECKSUMS WAL record.
+ */
+PG_FUNCTION_INFO_V1(dc_crash_before_xlog);
+Datum
+dc_crash_before_xlog(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		attach = PG_GETARG_BOOL(0);
+
+	if (attach)
+		InjectionPointAttach("datachecksums-xlogchecksums-pre-xloginsert",
+							 "test_checksums", "crash", NULL, 0);
+#else
+	elog(ERROR,
+		 "test is not working as intended when injection points are disabled");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_checksums/test_checksums.control b/src/test/modules/test_checksums/test_checksums.control
new file mode 100644
index 00000000000..84b4cc035a7
--- /dev/null
+++ b/src/test/modules/test_checksums/test_checksums.control
@@ -0,0 +1,4 @@
+comment = 'Test code for data checksums'
+default_version = '1.0'
+module_pathname = '$libdir/test_checksums'
+relocatable = true
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 35413f14019..3af7944acea 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3872,6 +3872,51 @@ sub advance_wal
 	}
 }
 
+=item $node->checksum_enable_offline()
+
+Enable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_enable_offline
+{
+	my ($self) = @_;
+
+	print "# Enabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-e');
+	return;
+}
+
+=item $node->checksum_disable_offline()
+
+Disable data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+
+sub checksum_disable_offline
+{
+	my ($self) = @_;
+
+	print "# Disabling checksums in \"$self->data_dir\"\n";
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-d');
+	return;
+}
+
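+=item $node->checksum_verify_offline()
+
+Verify data page checksums in an offline cluster with B<pg_checksums>. The
+caller is responsible for ensuring that the cluster is in the right state for
+this operation.
+
+=cut
+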
+sub checksum_verify_offline
+{
+	my ($self) = @_;
+
+	PostgreSQL::Test::Utils::system_or_bail('pg_checksums', '-D',
+		$self->data_dir, '-c');
+	return;
+}
+
 =pod
 
 =back
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 94e45dd4d57..6b66563a313 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2082,6 +2082,42 @@ pg_stat_progress_create_index| SELECT s.pid,
     s.param15 AS partitions_done
    FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_progress_data_checksums| SELECT s.pid,
+    s.datid,
+    d.datname,
+        CASE s.param1
+            WHEN 0 THEN 'enabling'::text
+            WHEN 1 THEN 'disabling'::text
+            WHEN 2 THEN 'waiting'::text
+            WHEN 3 THEN 'waiting on temporary tables'::text
+            WHEN 4 THEN 'waiting on checkpoint'::text
+            WHEN 5 THEN 'done'::text
+            ELSE NULL::text
+        END AS phase,
+        CASE s.param2
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param2
+        END AS databases_total,
+    s.param3 AS databases_done,
+        CASE s.param4
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param4
+        END AS relations_total,
+        CASE s.param5
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param5
+        END AS relations_done,
+        CASE s.param6
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param6
+        END AS blocks_total,
+        CASE s.param7
+            WHEN '-1'::integer THEN NULL::bigint
+            ELSE s.param7
+        END AS blocks_done
+   FROM (pg_stat_get_progress_info('DATACHECKSUMS'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
+     LEFT JOIN pg_database d ON ((s.datid = d.oid)))
+  ORDER BY s.datid;
 pg_stat_progress_vacuum| SELECT s.pid,
     s.datid,
     d.datname,
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 67e1860e984..c9feff8331e 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -51,6 +51,22 @@ client backend|relation|vacuum
 client backend|temp relation|normal
 client backend|wal|init
 client backend|wal|normal
+datachecksum launcher|relation|bulkread
+datachecksum launcher|relation|bulkwrite
+datachecksum launcher|relation|init
+datachecksum launcher|relation|normal
+datachecksum launcher|relation|vacuum
+datachecksum launcher|temp relation|normal
+datachecksum launcher|wal|init
+datachecksum launcher|wal|normal
+datachecksum worker|relation|bulkread
+datachecksum worker|relation|bulkwrite
+datachecksum worker|relation|init
+datachecksum worker|relation|normal
+datachecksum worker|relation|vacuum
+datachecksum worker|temp relation|normal
+datachecksum worker|wal|init
+datachecksum worker|wal|normal
 io worker|relation|bulkread
 io worker|relation|bulkwrite
 io worker|relation|init
@@ -95,7 +111,7 @@ walsummarizer|wal|init
 walsummarizer|wal|normal
 walwriter|wal|init
 walwriter|wal|normal
-(79 rows)
+(95 rows)
 \a
 -- ensure that both seqscan and indexscan plans are allowed
 SET enable_seqscan TO on;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cf3f6a7dafd..0fe39da9ec4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -417,6 +417,8 @@ CheckPointStmt
 CheckpointStatsData
 CheckpointerRequest
 CheckpointerShmemStruct
+ChecksumBarrierCondition
+ChecksumType
 Chromosome
 CkptSortItem
 CkptTsStatus
@@ -587,6 +589,7 @@ CustomScan
 CustomScanMethods
 CustomScanState
 CycleCtr
+DataChecksumsWorkerOperation
 DBState
 DCHCacheEntry
 DEADLOCK_INFO
@@ -610,6 +613,10 @@ DataPageDeleteStack
 DataTypesUsageChecks
 DataTypesUsageVersionCheck
 DatabaseInfo
+DataChecksumsWorkerDatabase
+DataChecksumsWorkerResult
+DataChecksumsWorkerResultEntry
+DataChecksumsWorkerShmemStruct
 DateADT
 DateTimeErrorExtra
 Datum
@@ -4272,6 +4279,7 @@ xl_btree_split
 xl_btree_unlink_page
 xl_btree_update
 xl_btree_vacuum
+xl_checksum_state
 xl_clog_truncate
 xl_commit_ts_truncate
 xl_dbase_create_file_copy_rec
-- 
2.39.3 (Apple Git-146)

#74Andres Freund
andres@anarazel.de
In reply to: Daniel Gustafsson (#73)
Re: Changing the state of data checksums in a running cluster

Hi,

One high level issue first: I don't think the way this uses checkpoints and
restartpoints is likely to work out.

Having to synchronously wait for restartpoints during recovery seems like a
huge operational issue in general, but it could also easily lead to undetected
deadlocks, particularly with syncrep.

Using the checksum state from the control file seems very fraught,
particularly with PITR, as the control file can be "from the future". Which
can be a problem e.g. if checksums were disabled, but we start recovery with a
control file with checksums enabled.

Forcing synchronous checkpoints in a bunch of places also will make this a
good bit slower than necessary, particularly for testing.

My suggestion for how to do this instead is to put the checksum state into the
XLOG_CHECKPOINT_* records. When starting recovery from an online checkpoint,
I think we should use the ChecksumType from the XLOG_CHECKPOINT_REDO record,
that way the standby/recovery environment's assumption about whether checksums
were enabled is the same as it was at the time the WAL was generated. For
shutdown checkpoints, we could either start to emit a XLOG_CHECKPOINT_REDO, or
we can use the information from the checkpoint record itself.

I think if we only ever use the checksum state from the point where we start
recovery, we might not need to force *any* checkpoints.

Daniel and I chatted about that proposal, and couldn't immediately come up
with scenarios where that would be wrong. For a while I thought there would
be problems when doing PITR from a base backup that had checksums initially
enabled, but where checksums were disabled before the base backup was
completed. My worry was that a later (once checksums were disabled) hint bit
write (which would not necessarily be WAL logged) would corrupt the checksum,
but I don't think that's a problem, because the startup process will only read
data pages in the process of processing WAL records, and if there's a WAL
record, there would also have to be an FPW, which would "cure" the
unchecksummed page.

More comments below, inline - I wrote these first, so it's possible that I
missed updating some of them in light of what I now wrote above.

+/*
+ * This must match largest number of sets in barrier_eq and barrier_ne in the
+ * below checksum_barriers definition.
+ */
+#define MAX_BARRIER_CONDITIONS 2
+
+/*
+ * Configuration of conditions which must match when absorbing a procsignal
+ * barrier during data checksum enable/disable operations.  A single function
+ * is used for absorbing all barriers, and the set of conditions to use is
+ * looked up in the checksum_barriers struct.  The struct member for the target
+ * state defines which state the backend must currently be in, and which it
+ * must not be in.
+ */
+typedef struct ChecksumBarrierCondition
+{
+	/* The target state of the barrier */
+	int			target;
+	/* A set of states in which at least one MUST match the current state */
+	int			barrier_eq[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_eq set */
+	int			barrier_eq_sz;
+	/* A set of states which all MUST NOT match the current state */
+	int			barrier_ne[MAX_BARRIER_CONDITIONS];
+	/* The number of elements in the barrier_ne set */
+	int			barrier_ne_sz;
+} ChecksumBarrierCondition;
+
+static const ChecksumBarrierCondition checksum_barriers[] =
+{
+	{PG_DATA_CHECKSUM_OFF, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION}, 2, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_VERSION, {PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION}, 1, {0}, 0},
+	{PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION, {PG_DATA_CHECKSUM_ANY_VERSION}, 1, {PG_DATA_CHECKSUM_VERSION}, 1},
+	{PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION, {PG_DATA_CHECKSUM_VERSION}, 1, {0}, 0},
+	{-1}
+};

The explanation for this doesn't really explain what the purpose of this thing
is... Perhaps worth referencing datachecksumsworker.c or such?

For a local, constant-size array you shouldn't need a -1 terminator, as you
can instead use lengthof() or such to detect invalid accesses.
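
I.e. something like (untested):

const ChecksumBarrierCondition *condition = NULL;

for (int i = 0; i < lengthof(checksum_barriers); i++)
{
    if (checksum_barriers[i].target == target_state)
    {
        condition = &checksum_barriers[i];
        break;
    }
}
if (condition == NULL)
    elog(ERROR, "invalid target state %d for data checksum barrier",
         target_state);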

+/*
+ * Flag to remember if the procsignalbarrier being absorbed for checksums is
+ * the first one.  The first procsignalbarrier can in rare cases be for the
+ * state we've initialized, i.e. a duplicate.  This may happen for any
+ * data_checksum_version value when the process is spawned between the update
+ * of XLogCtl->data_checksum_version and the barrier being emitted.  This can
+ * only happen on the very first barrier so mark that with this flag.
+ */
+static bool InitialDataChecksumTransition = true;

This is pretty hard to understand right now, at the very least it needs an
updated comment. But perhaps we can just get rid of this and accept barriers
that are redundant.

@@ -830,9 +905,10 @@ XLogInsertRecord(XLogRecData *rdata,
* only happen just after a checkpoint, so it's better to be slow in
* this case and fast otherwise.
*
-		 * Also check to see if fullPageWrites was just turned on or there's a
-		 * running backup (which forces full-page writes); if we weren't
-		 * already doing full-page writes then go back and recompute.
+		 * Also check to see if fullPageWrites was just turned on, there's a
+		 * running backup or if checksums are enabled (all of which forces
+		 * full-page writes); if we weren't already doing full-page writes
+		 * then go back and recompute.
*
* If we aren't doing full-page writes then RedoRecPtr doesn't
* actually affect the contents of the XLOG record, so we'll update
@@ -845,7 +921,9 @@ XLogInsertRecord(XLogRecData *rdata,
Assert(RedoRecPtr < Insert->RedoRecPtr);
RedoRecPtr = Insert->RedoRecPtr;
}
-		doPageWrites = (Insert->fullPageWrites || Insert->runningBackups > 0);
+		doPageWrites = (Insert->fullPageWrites ||
+						Insert->runningBackups > 0 ||
+						DataChecksumsNeedWrite());

if (doPageWrites &&
(!prevDoPageWrites ||

Why do we need to separately check for DataChecksumsNeedWrite() if turning on
checksums also forces Insert->fullPageWrites to on?

+/*
+ * SetDataChecksumsOnInProgress
+ *		Sets the data checksum state to "inprogress-on" to enable checksums
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on". See
+ * SetDataChecksumsOn below for a description on how this state change works.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOnInProgress(bool immediate_checkpoint)
+{
+	uint64		barrier;
+	int			flags;
+
+	Assert(ControlFile != NULL);
+
+	/*
+	 * The state transition is performed in a critical section with
+	 * checkpoints held off to provide crash safety.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();

ISTM that delayChkptFlags ought to only be set once within the critical
section. Obviously we can't fail inbetween as the code stands, but that's not
guaranteed to stay this way.

+	XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);

Maybe worth adding an assertion checking that we are currently in an expected
state (off or inprogress, I think?).

+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;

Swap as well.
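
I.e. roughly this ordering, here shown for the inprogress-on path (sketch):

START_CRIT_SECTION();
MyProc->delayChkptFlags |= DELAY_CHKPT_START;

XLogChecksums(PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION);

SpinLockAcquire(&XLogCtl->info_lck);
XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION;
SpinLockRelease(&XLogCtl->info_lck);

barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);

MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
END_CRIT_SECTION();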

Think it might be worth mentioning that we rely on the memory ordering implied
by XLogChecksums() and WaitForProcSignalBarrier() for the changes to
delayChkptFlags. Unless we have a different logic around that?

+	/*
+	 * Await state change in all backends to ensure that all backends are in
+	 * "inprogress-on". Once done we know that all backends are writing data
+	 * checksums.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	/*
+	 * force checkpoint to persist the current checksum state in control file
+	 * etc.
+	 *
+	 * XXX is this needed? There's already a checkpoint at the end of
+	 * ProcessAllDatabases, maybe this is redundant?
+	 */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);

Why do we need a checkpoint at all?

+}
+
+/*
+ * SetDataChecksumsOn
+ *		Enables data checksums cluster-wide
+ *
+ * Enabling data checksums is performed using two barriers, the first one to
+ * set the state to "inprogress-on" (done by SetDataChecksumsOnInProgress())
+ * and the second one to set the state to "on" (done here). Below is a short
+ * description of the processing, a more detailed write-up can be found in
+ * datachecksumsworker.c.
+ *
+ * To start the process of enabling data checksums in a running cluster the
+ * data_checksum_version state must be changed to "inprogress-on".  This state
+ * requires data checksums to be written but not verified. This ensures that
+ * all data pages can be checksummed without the risk of false negatives in
+ * validation during the process.  When all existing pages are guaranteed to
+ * have checksums, and all new pages will be initiated with checksums, the
+ * state can be changed to "on". Once the state is "on" checksums will be both
+ * written and verified. See datachecksumsworker.c for a longer discussion on
+ * how data checksums can be enabled in a running cluster.
+ *
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOn(bool immediate_checkpoint)
{
+	uint64		barrier;
+	int			flags;
+
Assert(ControlFile != NULL);
-	return (ControlFile->data_checksum_version > 0);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+
+	/*
+	 * The only allowed state transition to "on" is from "inprogress-on" since
+	 * that state ensures that all pages will have data checksums written.
+	 */
+	if (XLogCtl->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION)
+	{
+		SpinLockRelease(&XLogCtl->info_lck);
+		elog(ERROR, "checksums not in \"inprogress-on\" mode");

Seems like a PANIC condition to me...

+	}
+
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);
+	START_CRIT_SECTION();

I think it's a really really bad idea to do something fallible, like
INJECTION_POINT, after setting delayChkptFlags, but before entering the crit
section. Any error in the injection point will lead to a corrupted
delayChkptFlags, no?
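
I.e. something like:

/* safe to fail here, no state has been set up yet */
INJECTION_POINT("datachecksums-enable-checksums-delay", NULL);

START_CRIT_SECTION();
MyProc->delayChkptFlags |= DELAY_CHKPT_START;
/* XLogChecksums() etc. as before */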

+	XLogChecksums(PG_DATA_CHECKSUM_VERSION);
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = PG_DATA_CHECKSUM_VERSION;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	/*
+	 * Await state transition of "on" in all backends. When done we know that
+	 * data checksums are enabled in all backends and data checksums are both
+	 * written and verified.
+	 */
+	WaitForProcSignalBarrier(barrier);
+
+	INJECTION_POINT("datachecksums-enable-checksums-pre-checkpoint", NULL);
+
+	/* XXX is this needed? */
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * SetDataChecksumsOff
+ *		Disables data checksums cluster-wide
+ *
+ * Disabling data checksums must be performed with two sets of barriers, each
+ * carrying a different state. The state is first set to "inprogress-off"
+ * during which checksums are still written but not verified. This ensures that
+ * backends which have yet to observe the state change from "on" won't get
+ * validation errors on concurrently modified pages. Once all backends have
+ * changed to "inprogress-off", the barrier for moving to "off" can be emitted.
+ * This function blocks until all backends in the cluster have acknowledged the
+ * state transition.
+ */
+void
+SetDataChecksumsOff(bool immediate_checkpoint)
+{
+	[...]
+	/*
+	 * Ensure that we don't incur a checkpoint during disabling checksums.
+	 */
+	MyProc->delayChkptFlags |= DELAY_CHKPT_START;
+	START_CRIT_SECTION();
+
+	XLogChecksums(0);

Why no symbolic name here?

+	SpinLockAcquire(&XLogCtl->info_lck);
+	XLogCtl->data_checksum_version = 0;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+
+	END_CRIT_SECTION();
+	MyProc->delayChkptFlags &= ~DELAY_CHKPT_START;
+
+	WaitForProcSignalBarrier(barrier);
+
+	flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT;
+	if (immediate_checkpoint)
+		flags = flags | CHECKPOINT_FAST;
+	RequestCheckpoint(flags);
+}
+
+/*
+ * AbsorbDataChecksumsBarrier
+ *		Generic function for absorbing data checksum state changes
+ *
+ * All procsignalbarriers regarding data checksum state changes are absorbed
+ * with this function.  The set of conditions required for the state change to
+ * be accepted are listed in the checksum_barriers struct, target_state is
+ * used to look up the relevant entry.
+ */
+bool
+AbsorbDataChecksumsBarrier(int target_state)
+{
+	const ChecksumBarrierCondition *condition = checksum_barriers;
+	int			current = LocalDataChecksumVersion;
+	bool		found = false;
+
+	/*
+	 * Find the barrier condition definition for the target state. Not finding
+	 * a condition would be a grave programmer error as the states are a
+	 * discrete set.
+	 */
+	while (condition->target != target_state && condition->target != -1)
+		condition++;
+	if (unlikely(condition->target == -1))
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("invalid target state %i for data checksum barrier",
+					   target_state));

FWIW, you don't need unlikely() when the branch does an ereport(ERROR), as
ereports >=ERROR are marked "cold" automatically.

+	/*
+	 * The current state MUST be equal to one of the EQ states defined in this
+	 * barrier condition, or equal to the target_state if - and only if -
+	 * InitialDataChecksumTransition is true.
+	 */
+	for (int i = 0; i < condition->barrier_eq_sz; i++)
+	{
+		if (current == condition->barrier_eq[i] ||
+			condition->barrier_eq[i] == PG_DATA_CHECKSUM_ANY_VERSION)
+			found = true;
+	}
+	if (InitialDataChecksumTransition && current == target_state)
+		found = true;
+
+	/*
+	 * The current state MUST NOT be equal to any of the NE states defined in
+	 * this barrier condition.
+	 */
+	for (int i = 0; i < condition->barrier_ne_sz; i++)
+	{
+		if (current == condition->barrier_ne[i])
+			found = false;
+	}
+
+	/*
+	 * If the relevant state criteria aren't satisfied, throw an error which
+	 * will be caught by the procsignal machinery for a later retry.
+	 */
+	if (!found)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				errmsg("incorrect data checksum state %i for target state %i",
+					   current, target_state));
+
+	SetLocalDataChecksumVersion(target_state);
+	InitialDataChecksumTransition = false;
+	return true;
+}
+/*
+ * Log the new state of checksums
+ */
+static void
+XLogChecksums(uint32 new_type)
+{
+	xl_checksum_state xlrec;
+	XLogRecPtr	recptr;
+
+	xlrec.new_checksumtype = new_type;
+
+	XLogBeginInsert();
+	XLogRegisterData((char *) &xlrec, sizeof(xl_checksum_state));
+
+	INJECTION_POINT("datachecksums-xlogchecksums-pre-xloginsert", &new_type);
+
+	recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKSUMS);
+	XLogFlush(recptr);
+}

Why an injection point between XLogBeginInsert() and XLogInsert(), rather than
have the injection point before the XLogBeginInsert()?

+	else if (info == XLOG_CHECKSUMS)
+	{
+		xl_checksum_state state;
+		uint64		barrier;
+
+		memcpy(&state, XLogRecGetData(record), sizeof(xl_checksum_state));
+
+		/*
+		 * XXX Could this end up written to the control file prematurely? IIRC
+		 * that happens during checkpoint, so what if that gets triggered e.g.
+		 * because someone runs CHECKPOINT? If we then crash (or something
+		 * like that), could that confuse the instance?
+		 */
+		SpinLockAcquire(&XLogCtl->info_lck);
+		XLogCtl->data_checksum_version = state.new_checksumtype;
+		SpinLockRelease(&XLogCtl->info_lck);
+
+		/*
+		 * Block on a procsignalbarrier to await all processes having seen the
+		 * change to checksum status. Once the barrier has been passed we can
+		 * initiate the corresponding processing.
+		 */
+		switch (state.new_checksumtype)
+		{
+			case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			case PG_DATA_CHECKSUM_VERSION:
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
+				WaitForProcSignalBarrier(barrier);
+				break;
+
+			default:
+				Assert(state.new_checksumtype == 0);
+				barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
+				WaitForProcSignalBarrier(barrier);
+				break;

I'd not add a default clause, but add a case for each value of the enum. That
way we'll get warnings if the set of states changes.
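
I.e. something like this, where the cast to the enum lets the compiler
complain about unhandled values (sketch):

switch ((ChecksumType) state.new_checksumtype)
{
    case PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION:
        barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON);
        break;
    case PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION:
        barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF);
        break;
    case PG_DATA_CHECKSUM_VERSION:
        barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_ON);
        break;
    case PG_DATA_CHECKSUM_OFF:
        barrier = EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF);
        break;
    case PG_DATA_CHECKSUM_ANY_VERSION:
        elog(ERROR, "unexpected data checksum state in WAL record");
}
WaitForProcSignalBarrier(barrier);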

These WaitForProcSignalBarrier() are one of the scariest bits of the
patchset. If the startup process were to hold any lock that backends need, and
the backends waited for that lock without processing interrupts, we'd have an
undetected deadlock. This is much more likely to be a problem for the startup
process, as it does the work of many backends on the primary.

We do process interrupts while waiting for heavyweight locks, so that at least
is not an issue. Seems worth calling out rather explicitly.

+	if (checksumRestartPoint &&
+		(info == XLOG_CHECKPOINT_ONLINE ||
+		 info == XLOG_CHECKPOINT_REDO ||
+		 info == XLOG_CHECKPOINT_SHUTDOWN))
+	{
+		int			flags;
+
+		elog(LOG, "forcing creation of a restartpoint after XLOG_CHECKSUMS");
+
+		/* We explicitly want an immediate checkpoint here */
+		flags = CHECKPOINT_FORCE | CHECKPOINT_WAIT | CHECKPOINT_FAST;
+		RequestCheckpoint(flags);
+
+		checksumRestartPoint = false;
+	}

As noted above, I don't think we should rely on starting restartpoints.

diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 2be4e069816..baf6c8cc2cc 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -1613,7 +1613,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
* enabled for this cluster, and if this is a relation file, then verify
* the checksum.
*/
-	if (!noverify_checksums && DataChecksumsEnabled() &&
+	if (!noverify_checksums &&
+		DataChecksumsNeedWrite() &&
RelFileNumberIsValid(relfilenumber))
verify_checksum = true;

Why is DataChecksumsNeedWrite() being tested here?

--
-- We also set up some things as accessible to standard roles.
--
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 086c4c8fb6f..6d452b10bce 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1374,6 +1374,26 @@ CREATE VIEW pg_stat_progress_copy AS
FROM pg_stat_get_progress_info('COPY') AS S
LEFT JOIN pg_database D ON S.datid = D.oid;
+CREATE VIEW pg_stat_progress_data_checksums AS
+    SELECT
+        S.pid AS pid, S.datid, D.datname AS datname,
+        CASE S.param1 WHEN 0 THEN 'enabling'
+                      WHEN 1 THEN 'disabling'
+                      WHEN 2 THEN 'waiting on temporary tables'
+                      WHEN 3 THEN 'waiting on checkpoint'
+					  WHEN 4 THEN 'waiting on barrier'
+                      WHEN 5 THEN 'done'
+                      END AS phase,
+        CASE S.param2 WHEN -1 THEN NULL ELSE S.param2 END AS databases_total,
+        S.param3 AS databases_done,
+        CASE S.param4 WHEN -1 THEN NULL ELSE S.param4 END AS relations_total,
+        CASE S.param5 WHEN -1 THEN NULL ELSE S.param5 END AS relations_done,
+        CASE S.param6 WHEN -1 THEN NULL ELSE S.param6 END AS blocks_total,
+        CASE S.param7 WHEN -1 THEN NULL ELSE S.param7 END AS blocks_done
+    FROM pg_stat_get_progress_info('DATACHECKSUMS') AS S
+        LEFT JOIN pg_database D ON S.datid = D.oid
+    ORDER BY S.datid; -- return the launcher process first
+

Not this patch's fault, but I strongly dislike that we do this in SQL. Every
postgres database in the world has ~110kB of pg_stat_progress view definitions
in it. We should just do this in a C function.

diff --git a/src/backend/postmaster/datachecksumsworker.c b/src/backend/postmaster/datachecksumsworker.c
new file mode 100644
index 00000000000..57311760b2b
--- /dev/null
+++ b/src/backend/postmaster/datachecksumsworker.c
@@ -0,0 +1,1491 @@
+/*-------------------------------------------------------------------------
+ *
+ * datachecksumsworker.c
+ *	  Background worker for enabling or disabling data checksums online
+ *
+ * When enabling data checksums on a database at initdb time or when shut down
+ * with pg_checksums, no extra process is required as each page is checksummed,
+ * and verified, when accessed.  When enabling checksums on an already running
+ * cluster, this worker will ensure that all pages are checksummed before
+ * verification of the checksums is turned on. In the case of disabling
+ * checksums, the state transition is performed only in the control file, no
+ * changes are performed on the data pages.
+ *
+ * Checksums can be either enabled or disabled cluster-wide, with on/off being
+ * the end state for data_checksums.
+ *
+ * Enabling checksums
+ * ------------------
+ * When enabling checksums in an online cluster, data_checksums will be set to
+ * "inprogress-on" which signals that write operations MUST compute and write
+ * the checksum on the data page, but during reading the checksum SHALL NOT be
+ * verified. This ensures that all objects created during checksumming will
+ * have checksums set, but no reads will fail due to incorrect checksum.

Maybe "... due to not yet set checksums."? Incorrect checksums sounds like
it's about checksums that are actively wrong, rather than just not set. Except
for the corner case of a torn page in the process of a hint bit write, after
having disabled checksums, there shouldn't be incorrect ones, right?

The
+ * DataChecksumsWorker will compile a list of databases which exist at the
+ * start of checksumming, and all of these which haven't been dropped during
+ * the processing MUST have been processed successfully in order for checksums
+ * to be enabled. Any new relation created during processing will see the
+ * in-progress state and will automatically be checksummed.

What about new databases created while checksums are being enabled? They could
be copied before the worker has processed them. At least for the file_copy
strategy, the copy will be verbatim and thus will not necessarily have
checksums set.

+ * Synchronization and Correctness
+ * -------------------------------
+ * The processes involved in enabling, or disabling, data checksums in an
+ * online cluster must be properly synchronized with the normal backends
+ * serving concurrent queries to ensure correctness. Correctness is defined
+ * as the following:
+ *
+ *    - Backends SHALL NOT violate the data_checksums state they have agreed to
+ *      by acknowledging the procsignalbarrier:  This means that all backends
+ *      MUST calculate and write data checksums during all states except off;
+ *      MUST validate checksums only in the 'on' state.
+ *    - Data checksums SHALL NOT be considered enabled cluster-wide until all
+ *      currently connected backends have state "on": This means that all
+ *      backends must wait on the procsignalbarrier to be acknowledged by all
+ *      before proceeding to validate data checksums.
+ *
+ * There are two levels of synchronization required for changing data_checksums

Maybe s/levels/steps/?

+ * in an online cluster: (i) changing state in the active backends ("on",
+ * "off", "inprogress-on" and "inprogress-off"), and (ii) ensuring no
+ * incompatible objects and processes are left in a database when workers end.
+ * The former deals with cluster-wide agreement on data checksum state and the
+ * latter with ensuring that any concurrent activity cannot break the data
+ * checksum contract during processing.
+ *
+ * Synchronizing the state change is done with procsignal barriers, where the
+ * WAL logging backend updating the global state in the controlfile will wait

It's not entirely obvious what "the WAL logging backend" means.

+ * for all other backends to absorb the barrier. Barrier absorption will happen
+ * during interrupt processing, which means that connected backends will change
+ * state at different times. To prevent data checksum state changes when
+ * writing and verifying checksums, interrupts shall be held off before
+ * interrogating state and resumed when the IO operation has been performed.
+ *
+ *   When Enabling Data Checksums
+ *   ----------------------------

Odd change in indentation here.

+ *   A process which fails to observe data checksums being enabled can induce
+ *   two types of errors: failing to write the checksum when modifying the page
+ *   and failing to validate the data checksum on the page when reading it.
+ *
+ *   When processing starts all backends belong to one of the below sets, with
+ *   one set being empty:
+ *
+ *   Bd: Backends in "off" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   If processing is started in an online cluster then all backends are in Bd.
+ *   If processing was halted by the cluster shutting down, the controlfile
+ *   state "inprogress-on" will be observed on system startup and all backends
+ *   will be placed in Bd.

Why not in Bi? Just for simplicity's sake? ISTM we already need to be sure
that new backends start in Bi, as they might never observe the barrier...

Backends transition Bd -> Bi via a procsignalbarrier
+ *   which is emitted by the DataChecksumsLauncher.  When all backends have
+ *   acknowledged the barrier then Bd will be empty and the next phase can
+ *   begin: calculating and writing data checksums with DataChecksumsWorkers.
+ *   When the DataChecksumsWorker processes have finished writing checksums on
+ *   all pages and enables data checksums cluster-wide via another

s/enables/enabled/?

+ *   procsignalbarrier, there are four sets of backends where Bd shall be an
+ *   empty set:
+ *
+ *   Bg: Backend updating the global state and emitting the procsignalbarrier
+ *   Bd: Backends in "off" state
+ *   Be: Backends in "on" state
+ *   Bi: Backends in "inprogress-on" state
+ *
+ *   Backends in Bi and Be will write checksums when modifying a page, but only
+ *   backends in Be will verify the checksum during reading. The Bg backend is
+ *   blocked waiting for all backends in Bi to process interrupts and move to
+ *   Be. Any backend starting while Bg is waiting on the procsignalbarrier will
+ *   observe the global state being "on" and will thus automatically belong to
+ *   Be.  Checksums are enabled cluster-wide when Bi is an empty set. Bi and Be
+ *   are compatible sets while still operating based on their local state as
+ *   both write data checksums.
+ *
+ *   When Disabling Data Checksums
+ *   -----------------------------
+ *   A process which fails to observe that data checksums have been disabled
+ *   can induce two types of errors: writing the checksum when modifying the
+ *   page and validating a data checksum which is no longer correct due to
+ *   modifications to the page.

Hm. I wonder if we ought to zero out old checksums when loading a page into
s_b with checksums disabled... But that's really independent of this patchset.
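
(Just to note what I mean - a sketch of what could go in the buffer-read path,
not part of this patch:)

/* wipe a stale checksum when reading pages while checksums are off */
if (!DataChecksumsNeedVerify())
    ((PageHeader) page)->pd_checksum = 0;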

+ * Potential optimizations
+ * -----------------------
+ * Below are some potential optimizations and improvements which were brought
+ * up during reviews of this feature, but which weren't implemented in the
+ * initial version. These are ideas listed without any validation on their
+ * feasibility or potential payoff. More discussion on these can be found on
+ * the -hackers threads linked to in the commit message of this feature.
+ *
+ *   * Launching datachecksumsworker for resuming operation from the startup
+ *     process: Currently users have to restart processing manually after a
+ *     restart since dynamic background worker cannot be started from the
+ *     postmaster. Changing the startup process could make restarting the
+ *     processing automatic on cluster restart.
+ *   * Avoid dirtying the page when checksums already match: Iff the checksum
+ *     on the page happens to already match we still dirty the page. It should
+ *     be enough to only do the log_newpage_buffer() call in that case.
+ *   * Invent a lightweight WAL record that doesn't contain the full-page
+ *     image but just the block number: On replay, the redo routine would read
+ *     the page from disk.

The last sentence might be truncated?

+ *   * Teach pg_checksums to avoid checksummed pages when pg_checksums is used
+ *     to enable checksums on a cluster which is in inprogress-on state and
+ *     may have checksummed pages (make pg_checksums be able to resume an
+ *     online operation).
+ *   * Restartability (not necessarily with page granularity).
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/postmaster/datachecksumsworker.c

Hm. The whole set of interactions with checkpoints/restartpoints isn't explored
here?

+/*
+ * ProcessSingleRelationFork
+ *		Enable data checksums in a single relation/fork.
+ *
+ * Returns true if successful, and false if *aborted*. On error, an actual
+ * error is raised in the lower levels.
+ */
+static bool
+ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
+{
+	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
+	char		activity[NAMEDATALEN * 2 + 128];
+	char	   *relns;
+
+	relns = get_namespace_name(RelationGetNamespace(reln));
+
+	if (!relns)
+		return false;
+
+	/* Report the current relation to pgstat_activity */
+	snprintf(activity, sizeof(activity) - 1, "processing: %s.%s (%s, %dblocks)",
+			 relns, RelationGetRelationName(reln), forkNames[forkNum], numblocks);
+	pgstat_report_activity(STATE_RUNNING, activity);
+
+	/*
+	 * As of now we only update the block counter for main forks in order to
+	 * not cause too frequent calls. TODO: investigate whether we should do it
+	 * more frequent?
+	 */
+	if (forkNum == MAIN_FORKNUM)
+		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_TOTAL,
+									 numblocks);

That doesn't make much sense to me. Presumably the reason to skip it for the
other forks is that they're small-ish. But if so, there's no point in skipping
the reporting either, as presumably there wouldn't be a lot of reporting?

+	/*
+	 * We are looping over the blocks which existed at the time of process
+	 * start, which is safe since new blocks are created with checksums set
+	 * already due to the state being "inprogress-on".
+	 */
+	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
+	{
+		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+
+		/* Need to get an exclusive lock before we can flag as dirty */
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+		/*
+		 * Mark the buffer as dirty and force a full page write.  We have to
+		 * re-write the page to WAL even if the checksum hasn't changed,
+		 * because if there is a replica it might have a slightly different
+		 * version of the page with an invalid checksum, caused by unlogged
+		 * changes (e.g. hintbits) on the master happening while checksums
+		 * were off. This can happen if there was a valid checksum on the page
+		 * at one point in the past, so only when checksums are first on, then
+		 * off, and then turned on again.  TODO: investigate if this could be
+		 * avoided if the checksum is calculated to be correct and wal_level
+		 * is set to "minimal",
+		 */
+		START_CRIT_SECTION();
+		MarkBufferDirty(buf);
+		log_newpage_buffer(buf, false);
+		END_CRIT_SECTION();

Hm. It's pretty annoying to have to pass page_std = false here, that could
increase the write volume noticeably. But there's not a great way to know
what the right value would be :(

+/*
+ * launcher_exit
+ *
+ * Internal routine for cleaning up state when the launcher process exits. We
+ * need to clean up the abort flag to ensure that processing can be restarted
+ * again after it was previously aborted.
+ */
+static void
+launcher_exit(int code, Datum arg)
+{
+	if (launcher_running)
+	{
+		LWLockAcquire(DataChecksumsWorkerLock, LW_EXCLUSIVE);
+		launcher_running = false;
+		DataChecksumsWorkerShmem->launcher_running = false;
+		LWLockRelease(DataChecksumsWorkerLock);
+	}
+}

Could we end up exiting this with the worker still running?

+/*
+ * WaitForAllTransactionsToFinish
+ *		Blocks awaiting all current transactions to finish
+ *
+ * Returns when all transactions which are active at the call of the function
+ * have ended, or if the postmaster dies while waiting. If the postmaster dies
+ * the abort flag will be set to indicate that the caller of this shouldn't
+ * proceed.
+ *
+ * NB: this will return early, if aborted by SIGINT or if the target state
+ * is changed while we're running.

I think either here, or at its callsites, the patch needs to explain *why* we
are waiting for all transactions to finish. Presumably this is to ensure that
other sessions haven't created relations that we can't see yet?

It actually doesn't seem to wait for all transactions, just for ones with an
xid?

+ */
+static void
+WaitForAllTransactionsToFinish(void)
+{
+	TransactionId waitforxid;
+
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	waitforxid = XidFromFullTransactionId(TransamVariables->nextXid);
+	LWLockRelease(XidGenLock);
+
+	while (TransactionIdPrecedes(GetOldestActiveTransactionId(false, true), waitforxid))
+	{
+		char		activity[64];
+		int			rc;
+
+		/* Oldest running xid is older than us, so wait */
+		snprintf(activity,
+				 sizeof(activity),
+				 "Waiting for current transactions to finish (waiting for %u)",
+				 waitforxid);
+		pgstat_report_activity(STATE_RUNNING, activity);
+
+		/* Retry every 3 seconds */
+		ResetLatch(MyLatch);
+		rc = WaitLatch(MyLatch,
+					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+					   3000,
+					   WAIT_EVENT_CHECKSUM_ENABLE_STARTCONDITION);
+
+		/*
+		 * If the postmaster died we won't be able to enable checksums
+		 * cluster-wide so abort and hope to continue when restarted.
+		 */
+		if (rc & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+					errcode(ERRCODE_ADMIN_SHUTDOWN),
+					errmsg("postmaster exited during data checksum processing"),
+					errhint("Restart the database and restart data checksum processing by calling pg_enable_data_checksums()."));
+
+		LWLockAcquire(DataChecksumsWorkerLock, LW_SHARED);
+		if (DataChecksumsWorkerShmem->launch_operation != operation)
+			abort_requested = true;
+		LWLockRelease(DataChecksumsWorkerLock);
+		if (abort_requested)
+			break;

I don't like this much - loops with a timeout are generally a really bad idea
and we shouldn't add more instances. Presumably this also makes the tests
slower...

How about collecting the to-be-waited-for virtualxids and then wait for those?
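
Untested sketch, loosely following WaitForOlderSnapshots() in indexcmds.c (the
exact filter arguments would need more thought):

VirtualTransactionId *vxids;
int         nvxids;

/* collect the virtualxids running at this point ... */
vxids = GetCurrentVirtualXIDs(InvalidTransactionId, true, false,
                              PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
                              &nvxids);

/* ... and wait for each of them to end, without any polling */
for (int i = 0; i < nvxids; i++)
{
    if (VirtualTransactionIdIsValid(vxids[i]))
        VirtualXactLock(vxids[i], true);
}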

+	if (operation == ENABLE_DATACHECKSUMS)
+	{
+		/*
+		 * If we are asked to enable checksums in a cluster which already has
+		 * checksums enabled, exit immediately as there is nothing more to do.
+		 * Hold interrupts to make sure state doesn't change during checking.
+		 */
+		HOLD_INTERRUPTS();
+		if (DataChecksumsNeedVerify())
+		{
+			RESUME_INTERRUPTS();
+			goto done;
+		}
+		RESUME_INTERRUPTS();

I don't understand what this interrupt stuff achieves here?

diff --git a/src/include/storage/checksum.h b/src/include/storage/checksum.h
index 25d13a798d1..0faaac14b1b 100644
--- a/src/include/storage/checksum.h
+++ b/src/include/storage/checksum.h
@@ -15,6 +15,21 @@

#include "storage/block.h"

+/*
+ * Checksum version 0 is used for when data checksums are disabled (OFF).
+ * PG_DATA_CHECKSUM_VERSION defines that data checksums are enabled in the
+ * cluster and PG_DATA_CHECKSUM_INPROGRESS_{ON|OFF}_VERSION defines that data
+ * checksums are either currently being enabled or disabled.
+ */
+typedef enum ChecksumType
+{
+	PG_DATA_CHECKSUM_OFF = 0,
+	PG_DATA_CHECKSUM_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_ON_VERSION,
+	PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION,
+	PG_DATA_CHECKSUM_ANY_VERSION
+} ChecksumType;

Why is there "VERSION" in the name of these? Feels like that's basically just
vestigial at this point.

/*
* There also exist several built-in LWLock tranches.  As with the predefined
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index c6f5ebceefd..d90d35b1d6f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -463,11 +463,11 @@ extern PGDLLIMPORT PGPROC *PreparedXactProcs;
* Background writer, checkpointer, WAL writer, WAL summarizer, and archiver
* run during normal operation.  Startup process and WAL receiver also consume
* 2 slots, but WAL writer is launched only after startup has exited, so we
- * only need 6 slots.
+ * only need 6 slots to cover these. The DataChecksums worker and launcher
+ * can consume 2 slots when data checksums are enabled or disabled.
*/
#define MAX_IO_WORKERS          32
-#define NUM_AUXILIARY_PROCS		(6 + MAX_IO_WORKERS)
-
+#define NUM_AUXILIARY_PROCS		(8 + MAX_IO_WORKERS)

Aren't they bgworkers now?

diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index afeeb1ca019..c54c61e2cd8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -54,6 +54,11 @@ typedef enum
typedef enum
{
PROCSIGNAL_BARRIER_SMGRRELEASE, /* ask smgr to close files */
+
+	PROCSIGNAL_BARRIER_CHECKSUM_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_ON,
+	PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF,
+	PROCSIGNAL_BARRIER_CHECKSUM_ON,
} ProcSignalBarrierType;

I wonder if these really should be different barriers. What if we just made it
one, and instead drove the transition on the current shmem content?
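
I.e. a single barrier (say PROCSIGNAL_BARRIER_CHECKSUM, name invented) whose
absorb function just adopts whatever the authoritative state is, roughly
(sketch, assuming the shared state is reachable from wherever this ends up):

bool
ProcessBarrierDataChecksums(void)
{
    uint32      new_state;

    SpinLockAcquire(&XLogCtl->info_lck);
    new_state = XLogCtl->data_checksum_version;
    SpinLockRelease(&XLogCtl->info_lck);

    SetLocalDataChecksumVersion(new_state);
    return true;
}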

Other stuff:
- what protects against multiple backends enabling checksums at the same time?

Afaict there isn't anything, and we just ignore the second request. Which
seems ok-ish if it's the same request as before, but not great if it's a
different one.

Should also have tests for that.

- I think this adds a bit too much of the logic to xlog.c, already an unwieldy
file. A fair bit of all of this doesn't seem like it needs to be in there.

- the code seems somewhat split-brained about bgworkers and aux processes

Greetings,

Andres Freund