RecoveryInProgress() has critical side effects

Started by Robert Haasabout 4 years ago18 messages

robertmhaas@gmail.com

about 4 years ago

2 attachment(s)

Hi,

I have noticed that $SUBJECT, and I think it's bad, and I'd like to
propose some patches to fix it. 0001 is hopefully uncontroversial, but
0002 might need some discussion as I'm not quite sure what approach to
take.

The reason why RecoveryInProgress() has critical side effects is that
it calls InitXLOGAccess(). It turns out that the only critical thing
that InitXLOGAccess() does is call InitXLogInsert().[1]Isn't it great that these two similarly-named functions capitalize "xlog" differently? Wa-hoo. If a backend
doesn't call InitXLogInsert(), max_rdatas won't be initialized, and
the first attempt to insert WAL will probably fail with something like
"ERROR: too much WAL data". But there's no real reason why
InitXLogInsert() needs to happen only after recovery is finished. The
work that it does is just setting up data structures in backend-local
memory, and not much is lost if they are set up and not used. So, in
0001, I propose to move the call to InitXLogInsert() from
InitXLOGAccess() to BaseInit(). That means every backend will do it
the first time it starts up. It also means that CreateCheckPoint()
will no longer need a special case call to InitXLogInsert(), which
seems like a good thing.

Now, what about the other stuff that InitXLOGAccess() does? As far as
I can tell, "wal_segment_size = ControlFile->xlog_seg_size" is always
redundant, because LocalProcessControlFile() has always been called,
and it does the same thing. The postmaster does that call, and in
non-EXEC_BACKEND builds, the global variable setting will be inherited
by all child processes. In EXEC_BACKEND builds, every child process
will go through SubPostmasterMain() which also calls
LocalProcessControlFile() very early in the startup sequence.
InitXLOGAccess() also sets RedoRecPtr and doPageWrites ... but these
are non-critical values. If you set them to 0 and false respectively
and then call XLogInsert(), GetFullPagesWritesInfo() will return 0 and
false, XLogRecordAssemble() will use those values to make wrong
decisions, and then XLogInsertRecord() will notice that the wrong
values were used and will update them and cause XLogInsert() to loop
around. Similarly, XLogCheckpointNeeded() relies on the caller to set
RedoRecPtr appropriately, but all callers recheck after updating
RedoRecPtr, so it's fine if the initial value is zero. The only other
interesting case is XLogCheckBufferNeedsBackup(), which calls
GetFullPageWriteInfo(), but if somehow it gets the wrong answer, the
only thing that can happen is we might come the wrong conclusion about
whether a heap-update WAL record should attempt to compress by looking
for a common prefix and suffix between the old and new tuples. That's
non-critical, and if it happens or doesn't happen at most once per
backend, fine.

Nevertheless, what I did in 0002 is remove InitXLOGAccess() but move
the initialization of RedoRecPtr and doPageWrites into
GetFullPageWritesInfo(), only in the case where RedoRecPtr hasn't been
set yet. This has one property that I really really like, which is
that makes the code more understandable. There is no suggestion that
the correctness of WAL insertion somehow depends on a
RecoveryInProgress() call which may or may not have occurred at some
arbitrarily distant time in the past. Initializing the value at the
point of first use is just way clearer than initializing it as a
side-effect of some other function that's not even that tightly
connected. However, it does have the effect of introducing a branch
into GetFullPageWritesInfo(), which I'm not 100% sure is cheap enough
not to matter. I think the obvious alternative is to just not
initialize RedoRecPtr and doPageWrites at all and document in the
comments that the first time each backend does something with them
it's going to end up retrying once; perhaps that is preferable. Going
the other way, we could get rid of the global variables altogether and
have GetFullPageWrites() read the data from shared memory. However,
unless 8-byte atomicity is guaranteed, that requires a spinlock, so it
seems likely unappealing.

(In case the explanation above is unclear to casual readers, here's a
bit more detail: RedoRecPtr is the redo pointer from what we believe
to be the most recent checkpoint record, and doFullPageWrites is
whether full_page_writes are required at the moment. The latter can be
true either because full_page_writes=on or because a base backup is in
progress. When assembling an XLOG record, we have to decide whether to
include full-page images in that or not, and we do that based on the
value of these variables: if full page writes are required in general
and if the page LSN precedes RedoRecPtr, we include a full page image.
If it turns out that our RedoRecPtr was out of date - i.e. there's a
newer checkpoint we didn't know about - then we go back and
re-assemble the record to make sure all required full-page images are
included.)

Thoughts?

--
Robert Haas
EDB: http://www.enterprisedb.com

[1]: Isn't it great that these two similarly-named functions capitalize "xlog" differently? Wa-hoo.
"xlog" differently? Wa-hoo.

Attachments:

v1-0001-Move-InitXLogInsert-call-from-InitXLOGAccess-to-B.patchapplication/octet-stream; name=v1-0001-Move-InitXLogInsert-call-from-InitXLOGAccess-to-B.patchDownload

From a104cfdfea2571a0a3a6930f97945c6948278dbd Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Mon, 8 Nov 2021 16:35:30 -0500
Subject: [PATCH v1 1/2] Move InitXLogInsert() call from InitXLOGAccess() to
 BaseInit().

At present, there is an undocumented coding rule that you must call
RecoveryInProgress(), or do something else that results in a call
to InitXLogInsert(), before trying to write WAL. Otherwise, the
WAL construction buffers won't be initialized, resulting in
failures.

Since it's not good to rely on a status inquiry function like
RecoveryInProgress() having the side effect of initializing
critical data structures, instead do the initialization eariler,
when the backend first starts up.
---
 src/backend/access/transam/xlog.c | 13 -------------
 src/backend/utils/init/postinit.c |  6 ++++++
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e073121a7e..355d1737c3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8677,9 +8677,6 @@ InitXLOGAccess(void)
 	(void) GetRedoRecPtr();
 	/* Also update our copy of doPageWrites. */
 	doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
-
-	/* Also initialize the working areas for constructing WAL records */
-	InitXLogInsert();
 }
 
 /*
@@ -9129,16 +9126,6 @@ CreateCheckPoint(int flags)
 	if (RecoveryInProgress() && (flags & CHECKPOINT_END_OF_RECOVERY) == 0)
 		elog(ERROR, "can't create a checkpoint during recovery");
 
-	/*
-	 * Initialize InitXLogInsert working areas before entering the critical
-	 * section.  Normally, this is done by the first call to
-	 * RecoveryInProgress() or LocalSetXLogInsertAllowed(), but when creating
-	 * an end-of-recovery checkpoint, the LocalSetXLogInsertAllowed call is
-	 * done below in a critical section, and InitXLogInsert cannot be called
-	 * in a critical section.
-	 */
-	InitXLogInsert();
-
 	/*
 	 * Prepare to accumulate statistics.
 	 *
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 78bc64671e..0c56c38a14 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -541,6 +541,12 @@ BaseInit(void)
 	 * file shutdown hook can report temporary file statistics.
 	 */
 	InitTemporaryFileAccess();
+
+	/*
+	 * Initialize local buffers for WAL record construction, in case we
+	 * ever try to insert XLOG.
+	 */
+	InitXLogInsert();
 }
 
 
-- 
2.24.3 (Apple Git-128)

v1-0002-Remove-InitXLOGAccess.patchapplication/octet-stream; name=v1-0002-Remove-InitXLOGAccess.patchDownload

From a8eb5d5249458a453ceb09f30861fb3195ab070e Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Thu, 11 Nov 2021 10:55:16 -0500
Subject: [PATCH v1 2/2] Remove InitXLOGAccess().

It's not great that RecoveryInProgress() calls InitXLOGAccess(),
because a status inquiry function typically shouldn't have the side
effect of performing initializations. We could fix that by calling
InitXLOGAccess() from some other place, but instead, let's remove it
altogether.

One thing InitXLogAccess() did is initialize wal_segment_size, but it
doesn't need to do that. In the postmaster, PostmasterMain() calls
LocalProcessControlFile(), and all child processes will inherit that
value -- except in EXEC_BACKEND bulds, but then each backend runs
SubPostmasterMain() which also calls LocalProcessControlFile().

The other thing InitXLOGAccess() did is update RedoRecPtr and
doPageWrites. That doesn't necessarily need to be done at all,
because those values can always be inaccurate and all the code
that uses them will just retry if it turns out that they've
changed. However, doing it may avoid one retry, so make
GetFullPageWritesInfo() do the work instead.

The added cost of a branch in GetFullPageWritesInfo() is hopefully
small enough to be negligible, especially since we're also saving
something in RecoveryInProgress(). If not, we could also consider
skipping these initializations altogether and letting the first
attempt to insert WAL sort it out. That might result in, for example,
one extra call to XLogRecordAssemble() over the lifetime of the
backend, but that's probably not too bad.
---
 src/backend/access/transam/xlog.c   | 61 ++++++++++-------------------
 src/backend/postmaster/auxprocess.c |  1 -
 src/backend/utils/init/postinit.c   | 21 +++-------
 src/include/access/xlog.h           |  1 -
 4 files changed, 25 insertions(+), 59 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 355d1737c3..c718506bd9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -352,8 +352,7 @@ XLogRecPtr	XactLastCommitEnd = InvalidXLogRecPtr;
  * XLogCtl->Insert.RedoRecPtr, whenever we can safely do so (ie, when we
  * hold an insertion lock).  See XLogInsertRecord for details.  We are also
  * allowed to update from XLogCtl->RedoRecPtr if we hold the info_lck;
- * see GetRedoRecPtr.  A freshly spawned backend obtains the value during
- * InitXLOGAccess.
+ * see GetRedoRecPtr.
  */
 static XLogRecPtr RedoRecPtr;
 
@@ -8418,23 +8417,6 @@ RecoveryInProgress(void)
 
 		LocalRecoveryInProgress = (xlogctl->SharedRecoveryState != RECOVERY_STATE_DONE);
 
-		/*
-		 * Initialize TimeLineID and RedoRecPtr when we discover that recovery
-		 * is finished. InitPostgres() relies upon this behaviour to ensure
-		 * that InitXLOGAccess() is called at backend startup.  (If you change
-		 * this, see also LocalSetXLogInsertAllowed.)
-		 */
-		if (!LocalRecoveryInProgress)
-		{
-			/*
-			 * If we just exited recovery, make sure we read TimeLineID and
-			 * RedoRecPtr after SharedRecoveryState (for machines with weak
-			 * memory ordering).
-			 */
-			pg_memory_barrier();
-			InitXLOGAccess();
-		}
-
 		/*
 		 * Note: We don't need a memory barrier when we're still in recovery.
 		 * We might exit recovery immediately after return, so the caller
@@ -8551,9 +8533,6 @@ LocalSetXLogInsertAllowed(void)
 
 	LocalXLogInsertAllowed = 1;
 
-	/* Initialize as RecoveryInProgress() would do when switching state */
-	InitXLOGAccess();
-
 	return oldXLogAllowed;
 }
 
@@ -8660,25 +8639,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr,
 	return record;
 }
 
-/*
- * This must be called in a backend process before creating WAL records
- * (except in a standalone backend, which does StartupXLOG instead).  We need
- * to initialize the local copy of RedoRecPtr.
- */
-void
-InitXLOGAccess(void)
-{
-	XLogCtlInsert *Insert = &XLogCtl->Insert;
-
-	/* set wal_segment_size */
-	wal_segment_size = ControlFile->xlog_seg_size;
-
-	/* Use GetRedoRecPtr to copy the RedoRecPtr safely */
-	(void) GetRedoRecPtr();
-	/* Also update our copy of doPageWrites. */
-	doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
-}
-
 /*
  * Return the current Redo pointer from shared memory.
  *
@@ -8716,6 +8676,25 @@ GetRedoRecPtr(void)
 void
 GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p)
 {
+	/*
+	 * As a special case, if RedoRecPtr is InvalidXLogRecPtr, then it's never
+	 * been used for anything and the correct value must be different. In that
+	 * case, get the current value from shared memory.
+	 *
+	 * Nothing really bad would happen if we didn't do this, because the
+	 * caller can't rely on this information being accurate anyway, but it
+	 * might, for example, result in an extra call to XLogRecordAssemble.
+	 */
+	if (unlikely(RedoRecPtr == InvalidXLogRecPtr))
+	{
+		XLogCtlInsert *Insert = &XLogCtl->Insert;
+
+		/* Use GetRedoRecPtr to copy the RedoRecPtr safely */
+		(void) GetRedoRecPtr();
+		/* Also update our copy of doPageWrites. */
+		doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
+	}
+
 	*RedoRecPtr_p = RedoRecPtr;
 	*doPageWrites_p = doPageWrites;
 }
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 7452f908b2..43497676ab 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -154,7 +154,6 @@ AuxiliaryProcessMain(AuxProcType auxtype)
 			proc_exit(1);
 
 		case WalWriterProcess:
-			InitXLOGAccess();
 			WalWriterMain();
 			proc_exit(1);
 
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0c56c38a14..e5d0852080 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -624,25 +624,14 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 	}
 
 	/*
-	 * Initialize local process's access to XLOG.
+	 * If this is either a bootstrap process nor a standalone backend, start
+	 * up the XLOG machinery, and register to have it closed down at exit.
+	 * In other cases, the startup process is responsible for starting up
+	 * the XLOG machinery, and the checkpointer for closing it down.
 	 */
-	if (IsUnderPostmaster)
-	{
-		/*
-		 * The postmaster already started the XLOG machinery, but we need to
-		 * call InitXLOGAccess(), if the system isn't in hot-standby mode.
-		 * This is handled by calling RecoveryInProgress and ignoring the
-		 * result.
-		 */
-		(void) RecoveryInProgress();
-	}
-	else
+	if (!IsUnderPostmaster)
 	{
 		/*
-		 * We are either a bootstrap process or a standalone backend. Either
-		 * way, start up the XLOG machinery, and register to have it closed
-		 * down at exit.
-		 *
 		 * We don't yet have an aux-process resource owner, but StartupXLOG
 		 * and ShutdownXLOG will need one.  Hence, create said resource owner
 		 * (and register a callback to clean it up after ShutdownXLOG runs).
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 898df2ee03..34f6c89f06 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -299,7 +299,6 @@ extern void BootStrapXLOG(void);
 extern void LocalProcessControlFile(bool reset);
 extern void StartupXLOG(void);
 extern void ShutdownXLOG(int code, Datum arg);
-extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
 extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
-- 
2.24.3 (Apple Git-128)

Bossart, Nathan

bossartn@amazon.com

about 4 years ago

In reply to: Robert Haas (#1)

Re: RecoveryInProgress() has critical side effects

On 11/11/21, 9:20 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:

memory, and not much is lost if they are set up and not used. So, in
0001, I propose to move the call to InitXLogInsert() from
InitXLOGAccess() to BaseInit(). That means every backend will do it
the first time it starts up. It also means that CreateCheckPoint()
will no longer need a special case call to InitXLogInsert(), which
seems like a good thing.

0001 looks reasonable to me.

connected. However, it does have the effect of introducing a branch
into GetFullPageWritesInfo(), which I'm not 100% sure is cheap enough
not to matter. I think the obvious alternative is to just not
initialize RedoRecPtr and doPageWrites at all and document in the
comments that the first time each backend does something with them
it's going to end up retrying once; perhaps that is preferable. Going

Unless there's a demonstrable performance benefit from adding the
branch, my preference is to leave it out. I could be off-base, but it
seems possible that future changes might end up depending on any side
effects from this new branch, which is exactly what you're trying to
fix. Plus, always using the retry path is a good way to test that it
works as expected.

Nathan

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Bossart, Nathan (#2)

2 attachment(s)

Re: RecoveryInProgress() has critical side effects

On Fri, Nov 12, 2021 at 10:23 AM Bossart, Nathan <bossartn@amazon.com> wrote:

0001 looks reasonable to me.

Cool.

Unless there's a demonstrable performance benefit from adding the
branch, my preference is to leave it out. I could be off-base, but it
seems possible that future changes might end up depending on any side
effects from this new branch, which is exactly what you're trying to
fix. Plus, always using the retry path is a good way to test that it
works as expected.

Here's a new version that does it that way. Any other opinions?

The best thing I could come up with for a test case for this was to
try repeatedly making a new connection and running "SELECT
txid_current()", which will cause just one WAL record to be generated.
Unfortunately that path has overhead from a lot of other causes so I'm
not sure the results are very meaningful, but here they are:

v1: 0.378 ms
v2: 0.391 ms
common base commit (10eae82b2): 0.376 ms

methodology:
for i in `seq 1 1000`; do psql -c '\timing' -c 'select
txid_current();'; done | grep '^Time:' | awk '{total+=$2} END {print
total/NR}'
run twice, discarding the first result, since the very first
connection attempt after starting a new server process is notably
slower

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v2-0002-Remove-InitXLOGAccess.patchapplication/octet-stream; name=v2-0002-Remove-InitXLOGAccess.patchDownload

From ab79cbe2eb08405f4c899e6d6406a1ffb376dff3 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Mon, 15 Nov 2021 15:58:15 -0500
Subject: [PATCH v2 2/2] Remove InitXLOGAccess().

It's not great that RecoveryInProgress() calls InitXLOGAccess(),
because a status inquiry function typically shouldn't have the side
effect of performing initializations. We could fix that by calling
InitXLOGAccess() from some other place, but instead, let's remove it
altogether.

One thing InitXLogAccess() did is initialize wal_segment_size, but it
doesn't need to do that. In the postmaster, PostmasterMain() calls
LocalProcessControlFile(), and all child processes will inherit that
value -- except in EXEC_BACKEND bulds, but then each backend runs
SubPostmasterMain() which also calls LocalProcessControlFile().

The other thing InitXLOGAccess() did is update RedoRecPtr and
doPageWrites, but that's not critical, because all code that uses
them will just retry if it turns out that they've changed. The
only difference is that most code will now see an initial value that
is definitely invalid instead of one that might have just been way
out of date, but that will only happen once per backend lifetime,
so it shouldn't be a big deal.
---
 src/backend/access/transam/xlog.c   | 60 ++++++++---------------------
 src/backend/postmaster/auxprocess.c |  1 -
 src/backend/utils/init/postinit.c   | 21 +++-------
 src/include/access/xlog.h           |  1 -
 4 files changed, 22 insertions(+), 61 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 355d1737c3..ad6c2e11d1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -352,8 +352,14 @@ XLogRecPtr	XactLastCommitEnd = InvalidXLogRecPtr;
  * XLogCtl->Insert.RedoRecPtr, whenever we can safely do so (ie, when we
  * hold an insertion lock).  See XLogInsertRecord for details.  We are also
  * allowed to update from XLogCtl->RedoRecPtr if we hold the info_lck;
- * see GetRedoRecPtr.  A freshly spawned backend obtains the value during
- * InitXLOGAccess.
+ * see GetRedoRecPtr.
+ *
+ * NB: Code that uses this variable must be prepard not only for the
+ * possibility that it may be arbitrarily out of date, but also for the
+ * possibility that it might be set to InvalidXLogRecPtr. We used to
+ * initialize it as a side effect of the first call to RecoveryInProgress(),
+ * which meant that most code that might use it could assume that it had a
+ * real if perhaps stale value. That's no longer the case.
  */
 static XLogRecPtr RedoRecPtr;
 
@@ -361,6 +367,12 @@ static XLogRecPtr RedoRecPtr;
  * doPageWrites is this backend's local copy of (forcePageWrites ||
  * fullPageWrites).  It is used together with RedoRecPtr to decide whether
  * a full-page image of a page need to be taken.
+ *
+ * NB: Initially this is false, and there's no guarantee that it will be
+ * initialized to any other value before it is first used. Any code that
+ * makes use of it must recheck the value after obtaining a WALInsertLock,
+ * and respond appropriately if it turns out that the previous value wasn't
+ * accurate.
  */
 static bool doPageWrites;
 
@@ -8418,23 +8430,6 @@ RecoveryInProgress(void)
 
 		LocalRecoveryInProgress = (xlogctl->SharedRecoveryState != RECOVERY_STATE_DONE);
 
-		/*
-		 * Initialize TimeLineID and RedoRecPtr when we discover that recovery
-		 * is finished. InitPostgres() relies upon this behaviour to ensure
-		 * that InitXLOGAccess() is called at backend startup.  (If you change
-		 * this, see also LocalSetXLogInsertAllowed.)
-		 */
-		if (!LocalRecoveryInProgress)
-		{
-			/*
-			 * If we just exited recovery, make sure we read TimeLineID and
-			 * RedoRecPtr after SharedRecoveryState (for machines with weak
-			 * memory ordering).
-			 */
-			pg_memory_barrier();
-			InitXLOGAccess();
-		}
-
 		/*
 		 * Note: We don't need a memory barrier when we're still in recovery.
 		 * We might exit recovery immediately after return, so the caller
@@ -8551,9 +8546,6 @@ LocalSetXLogInsertAllowed(void)
 
 	LocalXLogInsertAllowed = 1;
 
-	/* Initialize as RecoveryInProgress() would do when switching state */
-	InitXLOGAccess();
-
 	return oldXLogAllowed;
 }
 
@@ -8660,25 +8652,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr,
 	return record;
 }
 
-/*
- * This must be called in a backend process before creating WAL records
- * (except in a standalone backend, which does StartupXLOG instead).  We need
- * to initialize the local copy of RedoRecPtr.
- */
-void
-InitXLOGAccess(void)
-{
-	XLogCtlInsert *Insert = &XLogCtl->Insert;
-
-	/* set wal_segment_size */
-	wal_segment_size = ControlFile->xlog_seg_size;
-
-	/* Use GetRedoRecPtr to copy the RedoRecPtr safely */
-	(void) GetRedoRecPtr();
-	/* Also update our copy of doPageWrites. */
-	doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
-}
-
 /*
  * Return the current Redo pointer from shared memory.
  *
@@ -8710,8 +8683,9 @@ GetRedoRecPtr(void)
  * full-page image to be included in the WAL record.
  *
  * The returned values are cached copies from backend-private memory, and
- * possibly out-of-date.  XLogInsertRecord will re-check them against
- * up-to-date values, while holding the WAL insert lock.
+ * possibly out-of-date or, indeed, uninitalized, in which case they will
+ * be InvalidXLogRecPtr and false, respectively.  XLogInsertRecord will
+ * re-check them against up-to-date values, while holding the WAL insert lock.
  */
 void
 GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p)
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 7452f908b2..43497676ab 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -154,7 +154,6 @@ AuxiliaryProcessMain(AuxProcType auxtype)
 			proc_exit(1);
 
 		case WalWriterProcess:
-			InitXLOGAccess();
 			WalWriterMain();
 			proc_exit(1);
 
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 0c56c38a14..e5d0852080 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -624,25 +624,14 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 	}
 
 	/*
-	 * Initialize local process's access to XLOG.
+	 * If this is either a bootstrap process nor a standalone backend, start
+	 * up the XLOG machinery, and register to have it closed down at exit.
+	 * In other cases, the startup process is responsible for starting up
+	 * the XLOG machinery, and the checkpointer for closing it down.
 	 */
-	if (IsUnderPostmaster)
-	{
-		/*
-		 * The postmaster already started the XLOG machinery, but we need to
-		 * call InitXLOGAccess(), if the system isn't in hot-standby mode.
-		 * This is handled by calling RecoveryInProgress and ignoring the
-		 * result.
-		 */
-		(void) RecoveryInProgress();
-	}
-	else
+	if (!IsUnderPostmaster)
 	{
 		/*
-		 * We are either a bootstrap process or a standalone backend. Either
-		 * way, start up the XLOG machinery, and register to have it closed
-		 * down at exit.
-		 *
 		 * We don't yet have an aux-process resource owner, but StartupXLOG
 		 * and ShutdownXLOG will need one.  Hence, create said resource owner
 		 * (and register a callback to clean it up after ShutdownXLOG runs).
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 898df2ee03..34f6c89f06 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -299,7 +299,6 @@ extern void BootStrapXLOG(void);
 extern void LocalProcessControlFile(bool reset);
 extern void StartupXLOG(void);
 extern void ShutdownXLOG(int code, Datum arg);
-extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
 extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
-- 
2.24.3 (Apple Git-128)

v2-0001-Move-InitXLogInsert-call-from-InitXLOGAccess-to-B.patchapplication/octet-stream; name=v2-0001-Move-InitXLogInsert-call-from-InitXLOGAccess-to-B.patchDownload

From a104cfdfea2571a0a3a6930f97945c6948278dbd Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Mon, 8 Nov 2021 16:35:30 -0500
Subject: [PATCH v2 1/2] Move InitXLogInsert() call from InitXLOGAccess() to
 BaseInit().

At present, there is an undocumented coding rule that you must call
RecoveryInProgress(), or do something else that results in a call
to InitXLogInsert(), before trying to write WAL. Otherwise, the
WAL construction buffers won't be initialized, resulting in
failures.

Since it's not good to rely on a status inquiry function like
RecoveryInProgress() having the side effect of initializing
critical data structures, instead do the initialization eariler,
when the backend first starts up.
---
 src/backend/access/transam/xlog.c | 13 -------------
 src/backend/utils/init/postinit.c |  6 ++++++
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e073121a7e..355d1737c3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8677,9 +8677,6 @@ InitXLOGAccess(void)
 	(void) GetRedoRecPtr();
 	/* Also update our copy of doPageWrites. */
 	doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
-
-	/* Also initialize the working areas for constructing WAL records */
-	InitXLogInsert();
 }
 
 /*
@@ -9129,16 +9126,6 @@ CreateCheckPoint(int flags)
 	if (RecoveryInProgress() && (flags & CHECKPOINT_END_OF_RECOVERY) == 0)
 		elog(ERROR, "can't create a checkpoint during recovery");
 
-	/*
-	 * Initialize InitXLogInsert working areas before entering the critical
-	 * section.  Normally, this is done by the first call to
-	 * RecoveryInProgress() or LocalSetXLogInsertAllowed(), but when creating
-	 * an end-of-recovery checkpoint, the LocalSetXLogInsertAllowed call is
-	 * done below in a critical section, and InitXLogInsert cannot be called
-	 * in a critical section.
-	 */
-	InitXLogInsert();
-
 	/*
 	 * Prepare to accumulate statistics.
 	 *
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 78bc64671e..0c56c38a14 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -541,6 +541,12 @@ BaseInit(void)
 	 * file shutdown hook can report temporary file statistics.
 	 */
 	InitTemporaryFileAccess();
+
+	/*
+	 * Initialize local buffers for WAL record construction, in case we
+	 * ever try to insert XLOG.
+	 */
+	InitXLogInsert();
 }
 
 
-- 
2.24.3 (Apple Git-128)

Bossart, Nathan

bossartn@amazon.com

about 4 years ago

In reply to: Robert Haas (#3)

Re: RecoveryInProgress() has critical side effects

On 11/15/21, 1:30 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:

Here's a new version that does it that way. Any other opinions?

LGTM

The best thing I could come up with for a test case for this was to
try repeatedly making a new connection and running "SELECT
txid_current()", which will cause just one WAL record to be generated.
Unfortunately that path has overhead from a lot of other causes so I'm
not sure the results are very meaningful, but here they are:

v1: 0.378 ms
v2: 0.391 ms
common base commit (10eae82b2): 0.376 ms

I'm personally not too worried about a ~4% regression in this
particular benchmark...

Nathan

Michael Paquier

michael@paquier.xyz

about 4 years ago

In reply to: Bossart, Nathan (#4)

Re: RecoveryInProgress() has critical side effects

On Mon, Nov 15, 2021 at 10:09:17PM +0000, Bossart, Nathan wrote:

On 11/15/21, 1:30 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:

Here's a new version that does it that way. Any other opinions?

LGTM

Patch 0001 means that the startup process would set up the structures
to be able to build WAL records before running any kind of recovery
action rather than doing it when it really needs it. That's fine by
me.

Is patch 0002 actually right regarding the handling of doPageWrites?
Once applied, we finish by setting it when the startup process starts
and not anymore at the end of recovery based on its the state of
Insert, but this could have changed while in recovery when replaying
one or more XLOG_FPW_CHANGE records.

I'm personally not too worried about a ~4% regression in this
particular benchmark...

This is not a hot code path, that should be fine.
--
Michael

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Michael Paquier (#5)

Re: RecoveryInProgress() has critical side effects

On Mon, Nov 15, 2021 at 9:34 PM Michael Paquier <michael@paquier.xyz> wrote:

Patch 0001 means that the startup process would set up the structures
to be able to build WAL records before running any kind of recovery
action rather than doing it when it really needs it. That's fine by
me.

Great, thanks. I think I'll go ahead and commit this one, then.

Is patch 0002 actually right regarding the handling of doPageWrites?
Once applied, we finish by setting it when the startup process starts
and not anymore at the end of recovery based on its the state of
Insert, but this could have changed while in recovery when replaying
one or more XLOG_FPW_CHANGE records.

Maybe I'm not understanding you properly here, but it seems like you
might be forgetting that this is a local variable and thus every
backend is going to have something different. In the startup process,
it will be initialized by StartupXLOG(); in other processes, it's
currently initialized by RecoveryInProgress(), but with this patch it
wouldn't be. Either way, it's then updated by future calls to
XLogInsertRecord() as required. XLOG_FPW_CHANGE records might affect
the new value that gets set the next time XLogInsertRecord(), but they
don't directly affect doPageWrites.

I'm personally not too worried about a ~4% regression in this
particular benchmark...

This is not a hot code path, that should be fine.

OK. I'll wait a while and see if anyone else wants to weigh in.

--
Robert Haas
EDB: http://www.enterprisedb.com

Andres Freund

andres@anarazel.de

about 4 years ago

In reply to: Robert Haas (#1)

Re: RecoveryInProgress() has critical side effects

Hi,

On 2021-11-11 12:15:03 -0500, Robert Haas wrote:

The reason why RecoveryInProgress() has critical side effects is that
it calls InitXLOGAccess(). It turns out that the only critical thing
that InitXLOGAccess() does is call InitXLogInsert().[1] If a backend
doesn't call InitXLogInsert(), max_rdatas won't be initialized, and
the first attempt to insert WAL will probably fail with something like
"ERROR: too much WAL data". But there's no real reason why
InitXLogInsert() needs to happen only after recovery is finished.

I think we should consider allocating that memory in postmaster, on !windows
at least. Or, perhaps even better, to initially use some static variable, and
only use a separate memory context when XLogEnsureRecordSpace(). Reserving
all that memory in every process just needlessly increases our per-connection
memory usage, for no good reason.

The
work that it does is just setting up data structures in backend-local
memory, and not much is lost if they are set up and not used. So, in
0001, I propose to move the call to InitXLogInsert() from
InitXLOGAccess() to BaseInit(). That means every backend will do it
the first time it starts up. It also means that CreateCheckPoint()
will no longer need a special case call to InitXLogInsert(), which
seems like a good thing.

Yes. Seems making BaseInit() being executed at a halfway consistent point is
starting to have some benefits at least ;)

Nevertheless, what I did in 0002 is remove InitXLOGAccess() but move
the initialization of RedoRecPtr and doPageWrites into
GetFullPageWritesInfo(), only in the case where RedoRecPtr hasn't been
set yet. This has one property that I really really like, which is
that makes the code more understandable. There is no suggestion that
the correctness of WAL insertion somehow depends on a
RecoveryInProgress() call which may or may not have occurred at some
arbitrarily distant time in the past. Initializing the value at the
point of first use is just way clearer than initializing it as a
side-effect of some other function that's not even that tightly
connected. However, it does have the effect of introducing a branch
into GetFullPageWritesInfo(), which I'm not 100% sure is cheap enough
not to matter.

Hm. Compared to the other costs of the XLogInsert do/while loop it probably
doesn't matter. One nearly-always-false branch is vastly cheaper than going
through the loop unnecessarily. Sure, the unnecessarily iteration saved will
only be the first (for now, it might be worth to refresh the values more
often), but there's enough branches in the body of the loop that I can't
really see it mattering.

Maybe kinda conceivably the cost of that branch could be an argument if
GetFullPageWritesInfo() were inline in XLogInsert(). But it isn't.

I think the obvious alternative is to just not
initialize RedoRecPtr and doPageWrites at all and document in the
comments that the first time each backend does something with them
it's going to end up retrying once; perhaps that is preferable. Going
the other way, we could get rid of the global variables altogether and
have GetFullPageWrites() read the data from shared memory. However,
unless 8-byte atomicity is guaranteed, that requires a spinlock, so it
seems likely unappealing.

I think it'd be fine to just make them 8byte atomics, which'd be lock-free on
relevant platforms (at least once the aarch64 thing is fixed, there's a
patch).

XLogCtlInsert already takes care to ensure that RedoRecPtr/fullPageWrites are
on a cacheline that's not constantly modified.

If we really wanted to optimize the no-8-byte-single-copy-atomicity path, we
could introduce a counter counting how many times RedoRecPtr/fullPageWrites
have changed. But it seems unlikely to be worth it.

Greetings,

Andres Freund

andres@anarazel.de

about 4 years ago

In reply to: Robert Haas (#3)

Re: RecoveryInProgress() has critical side effects

Hi,

On 2021-11-15 16:29:28 -0500, Robert Haas wrote:

v1: 0.378 ms
v2: 0.391 ms
common base commit (10eae82b2): 0.376 ms

Is this with assertions enabled? Because on my workstation (which likely is
slower on a single-core basis than your laptop) I get around half of this with
an optimized build (and a non-optimized build rarely is worth using as a
measuring stick). It could also be that your psqlrc does something
heavyweight?

methodology:
for i in `seq 1 1000`; do psql -c '\timing' -c 'select
txid_current();'; done | grep '^Time:' | awk '{total+=$2} END {print
total/NR}'
run twice, discarding the first result, since the very first
connection attempt after starting a new server process is notably
slower

Hm. I think this might included a bunch of convoluting factors that make it
hard to judge the actual size of the performance difference. It'll
e.g. include a fair bit of syscache initialization overhead that won't change
between the two approaches. This could be addressed by doing something like -c
"SELECT 'txid_current'::regproc;" first.

Also, the psql invocations itself use up a fair bit of time, even if you
ignored them from the result with the \timing trick. A pgbench -C doing 1k
SELECT 1;s is about ~1.5s for me, whereas psql is ~5.7s.

Just to size up the order of magnitude of overhead of the connection
establishment and syscache initialization, I ran a pgbench with a script of

SELECT 1;
SELECT 1;
SELECT 'txid_current()'::regprocedure;
SELECT 'txid_current()'::regprocedure;
SELECT txid_current();
SELECT txid_current();

and ran that with pgbench -n -f /tmp/txid.sql -C -t1000 -P1 --report-latencies
and got
statement latencies in milliseconds:
0.125 SELECT 1;
0.024 SELECT 1;
0.095 SELECT 'txid_current()'::regprocedure;
0.025 SELECT 'txid_current()'::regprocedure;
0.033 SELECT txid_current();
0.024 SELECT txid_current();

which nicely shows how much of the overhead is not related to the actual
txid_current() invocation.

Greetings,

Andres Freund

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Andres Freund (#8)

Re: RecoveryInProgress() has critical side effects

On Tue, Nov 16, 2021 at 2:27 PM Andres Freund <andres@anarazel.de> wrote:

Is this with assertions enabled? Because on my workstation (which likely is
slower on a single-core basis than your laptop) I get around half of this with
an optimized build (and a non-optimized build rarely is worth using as a
measuring stick). It could also be that your psqlrc does something
heavyweight?

Assertions disabled. ~/.psqlrc does not exist.

Hm. I think this might included a bunch of convoluting factors that make it
hard to judge the actual size of the performance difference.

Yes, I think so, too.

--
Robert Haas
EDB: http://www.enterprisedb.com

#10

Andres Freund

andres@anarazel.de

about 4 years ago

In reply to: Robert Haas (#9)

Re: RecoveryInProgress() has critical side effects

Hi,

On 2021-11-16 15:19:19 -0500, Robert Haas wrote:

Hm. I think this might included a bunch of convoluting factors that make it
hard to judge the actual size of the performance difference.

Yes, I think so, too.

FWIW I ran that pgench thing I presented upthread, and I didn't see any
meaningful and repeatable performance difference 354a1f8d220, ad26ee28250 and
0002 applied ontop of ad26ee28250. The run-to-run variance is way higher than
the difference between the changes.

Greetings,

Andres Freund

#11

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Andres Freund (#10)

Re: RecoveryInProgress() has critical side effects

On Tue, Nov 16, 2021 at 3:42 PM Andres Freund <andres@anarazel.de> wrote:

On 2021-11-16 15:19:19 -0500, Robert Haas wrote:

Hm. I think this might included a bunch of convoluting factors that make it
hard to judge the actual size of the performance difference.

Yes, I think so, too.

FWIW I ran that pgench thing I presented upthread, and I didn't see any
meaningful and repeatable performance difference 354a1f8d220, ad26ee28250 and
0002 applied ontop of ad26ee28250. The run-to-run variance is way higher than
the difference between the changes.

Thanks. I suspected that the results I was seeing were not meaningful,
but it's hard to be sure when the results seem to be repeatable
locally.

I'm still not entirely clear on whether you prefer v1-0002, v2-0002,
or something else.

--
Robert Haas
EDB: http://www.enterprisedb.com

#12

Andres Freund

andres@anarazel.de

about 4 years ago

In reply to: Robert Haas (#11)

Re: RecoveryInProgress() has critical side effects

On 2021-11-16 16:30:27 -0500, Robert Haas wrote:

I'm still not entirely clear on whether you prefer v1-0002, v2-0002,
or something else.

I think it basically doesn't matter much. It's such a small piece of the cost
compared to either the cost of a single insert or the ratio between
RedoRecPtr/FPW changes and the number of inserted records.

I guess v2-0002 is mildly simpler, so I very weakly lean towards that.

#13

Heikki Linnakangas

hlinnaka@iki.fi

about 4 years ago

In reply to: Andres Freund (#12)

Re: RecoveryInProgress() has critical side effects

On 17/11/2021 00:04, Andres Freund wrote:

On 2021-11-16 16:30:27 -0500, Robert Haas wrote:

I'm still not entirely clear on whether you prefer v1-0002, v2-0002,
or something else.

I think it basically doesn't matter much. It's such a small piece of the cost
compared to either the cost of a single insert or the ratio between
RedoRecPtr/FPW changes and the number of inserted records.

I guess v2-0002 is mildly simpler, so I very weakly lean towards that.

I was going to argue for v1, because I was worried about generating
full-page writes unnecessarily. But looking closer, it doesn't do that.
It will initially think that no full page images are needed, and perform
the extra loop in XLogInsert() to try again with full page images.
Similarly in heap_update(), it will fail in the direction of sometimes
doing the "prefix-suffix compression" unnecessarily, which means a
little bit of extra CPU work, but it won't affect the WAL record that
gets written.

So either way is fine, really. That's a bit subtle, though, a comment
somewhere would be good.

But here's yet another idea: We could initialize RedoRecPtr and
doPageWrites in InitXLogInsert(). It would seem like a pretty natural
place for it.

- Heikki

PS. typo in patch v2: s/prepard/prepared/

#14

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Heikki Linnakangas (#13)

1 attachment(s)

Re: RecoveryInProgress() has critical side effects

On Sat, Nov 20, 2021 at 3:22 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

But here's yet another idea: We could initialize RedoRecPtr and
doPageWrites in InitXLogInsert(). It would seem like a pretty natural
place for it.

I considered that, but it seemed like an abstraction violation to me,
because that code is part of xloginsert.c, which is mostly concerned
with assembly of write-ahead log records, not the associated
shared-memory state. Also, I don't think there's any guarantee that
the state in shared memory is initialized at the time we call that
function, so we might just be copying uninitialized memory into other
uninitialized memory.

PS. typo in patch v2: s/prepard/prepared/

Thanks, fixed.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v3-0001-Remove-InitXLOGAccess.patchapplication/octet-stream; name=v3-0001-Remove-InitXLOGAccess.patchDownload

From ed4d459615e36d020560588497779bb0c042135b Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 1 Dec 2021 10:32:11 -0500
Subject: [PATCH v3] Remove InitXLOGAccess().

It's not great that RecoveryInProgress() calls InitXLOGAccess(),
because a status inquiry function typically shouldn't have the side
effect of performing initializations. We could fix that by calling
InitXLOGAccess() from some other place, but instead, let's remove it
altogether.

One thing InitXLogAccess() did is initialize wal_segment_size, but it
doesn't need to do that. In the postmaster, PostmasterMain() calls
LocalProcessControlFile(), and all child processes will inherit that
value -- except in EXEC_BACKEND bulds, but then each backend runs
SubPostmasterMain() which also calls LocalProcessControlFile().

The other thing InitXLOGAccess() did is update RedoRecPtr and
doPageWrites, but that's not critical, because all code that uses
them will just retry if it turns out that they've changed. The
only difference is that most code will now see an initial value that
is definitely invalid instead of one that might have just been way
out of date, but that will only happen once per backend lifetime,
so it shouldn't be a big deal.

Patch by me, reviewed by Nathan Bossart, Michael Paquier, Andres
Freund, and Heikki Linnakangas.

Discussion: http://postgr.es/m/CA+TgmoY7b65qRjzHN_tWUk8B4sJqk1vj1d31uepVzmgPnZKeLg@mail.gmail.com
---
 src/backend/access/transam/xlog.c   | 63 ++++++++---------------------
 src/backend/postmaster/auxprocess.c |  1 -
 src/backend/utils/init/postinit.c   | 21 +++-------
 src/include/access/xlog.h           |  1 -
 4 files changed, 22 insertions(+), 64 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d894af310a..c3753105b5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -350,8 +350,14 @@ XLogRecPtr	XactLastCommitEnd = InvalidXLogRecPtr;
  * XLogCtl->Insert.RedoRecPtr, whenever we can safely do so (ie, when we
  * hold an insertion lock).  See XLogInsertRecord for details.  We are also
  * allowed to update from XLogCtl->RedoRecPtr if we hold the info_lck;
- * see GetRedoRecPtr.  A freshly spawned backend obtains the value during
- * InitXLOGAccess.
+ * see GetRedoRecPtr.
+ *
+ * NB: Code that uses this variable must be prepared not only for the
+ * possibility that it may be arbitrarily out of date, but also for the
+ * possibility that it might be set to InvalidXLogRecPtr. We used to
+ * initialize it as a side effect of the first call to RecoveryInProgress(),
+ * which meant that most code that might use it could assume that it had a
+ * real if perhaps stale value. That's no longer the case.
  */
 static XLogRecPtr RedoRecPtr;
 
@@ -359,6 +365,12 @@ static XLogRecPtr RedoRecPtr;
  * doPageWrites is this backend's local copy of (forcePageWrites ||
  * fullPageWrites).  It is used together with RedoRecPtr to decide whether
  * a full-page image of a page need to be taken.
+ *
+ * NB: Initially this is false, and there's no guarantee that it will be
+ * initialized to any other value before it is first used. Any code that
+ * makes use of it must recheck the value after obtaining a WALInsertLock,
+ * and respond appropriately if it turns out that the previous value wasn't
+ * accurate.
  */
 static bool doPageWrites;
 
@@ -8390,9 +8402,6 @@ PerformRecoveryXLogAction(void)
  *
  * Unlike testing InRecovery, this works in any process that's connected to
  * shared memory.
- *
- * As a side-effect, we initialize the local RedoRecPtr variable the first
- * time we see that recovery is finished.
  */
 bool
 RecoveryInProgress(void)
@@ -8414,23 +8423,6 @@ RecoveryInProgress(void)
 
 		LocalRecoveryInProgress = (xlogctl->SharedRecoveryState != RECOVERY_STATE_DONE);
 
-		/*
-		 * Initialize TimeLineID and RedoRecPtr when we discover that recovery
-		 * is finished. InitPostgres() relies upon this behaviour to ensure
-		 * that InitXLOGAccess() is called at backend startup.  (If you change
-		 * this, see also LocalSetXLogInsertAllowed.)
-		 */
-		if (!LocalRecoveryInProgress)
-		{
-			/*
-			 * If we just exited recovery, make sure we read TimeLineID and
-			 * RedoRecPtr after SharedRecoveryState (for machines with weak
-			 * memory ordering).
-			 */
-			pg_memory_barrier();
-			InitXLOGAccess();
-		}
-
 		/*
 		 * Note: We don't need a memory barrier when we're still in recovery.
 		 * We might exit recovery immediately after return, so the caller
@@ -8547,9 +8539,6 @@ LocalSetXLogInsertAllowed(void)
 
 	LocalXLogInsertAllowed = 1;
 
-	/* Initialize as RecoveryInProgress() would do when switching state */
-	InitXLOGAccess();
-
 	return oldXLogAllowed;
 }
 
@@ -8656,25 +8645,6 @@ ReadCheckpointRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr,
 	return record;
 }
 
-/*
- * This must be called in a backend process before creating WAL records
- * (except in a standalone backend, which does StartupXLOG instead).  We need
- * to initialize the local copy of RedoRecPtr.
- */
-void
-InitXLOGAccess(void)
-{
-	XLogCtlInsert *Insert = &XLogCtl->Insert;
-
-	/* set wal_segment_size */
-	wal_segment_size = ControlFile->xlog_seg_size;
-
-	/* Use GetRedoRecPtr to copy the RedoRecPtr safely */
-	(void) GetRedoRecPtr();
-	/* Also update our copy of doPageWrites. */
-	doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
-}
-
 /*
  * Return the current Redo pointer from shared memory.
  *
@@ -8706,8 +8676,9 @@ GetRedoRecPtr(void)
  * full-page image to be included in the WAL record.
  *
  * The returned values are cached copies from backend-private memory, and
- * possibly out-of-date.  XLogInsertRecord will re-check them against
- * up-to-date values, while holding the WAL insert lock.
+ * possibly out-of-date or, indeed, uninitalized, in which case they will
+ * be InvalidXLogRecPtr and false, respectively.  XLogInsertRecord will
+ * re-check them against up-to-date values, while holding the WAL insert lock.
  */
 void
 GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p)
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 7452f908b2..43497676ab 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -154,7 +154,6 @@ AuxiliaryProcessMain(AuxProcType auxtype)
 			proc_exit(1);
 
 		case WalWriterProcess:
-			InitXLOGAccess();
 			WalWriterMain();
 			proc_exit(1);
 
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 646126edee..7292e51f7d 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -624,25 +624,14 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 	}
 
 	/*
-	 * Initialize local process's access to XLOG.
+	 * If this is either a bootstrap process nor a standalone backend, start
+	 * up the XLOG machinery, and register to have it closed down at exit.
+	 * In other cases, the startup process is responsible for starting up
+	 * the XLOG machinery, and the checkpointer for closing it down.
 	 */
-	if (IsUnderPostmaster)
-	{
-		/*
-		 * The postmaster already started the XLOG machinery, but we need to
-		 * call InitXLOGAccess(), if the system isn't in hot-standby mode.
-		 * This is handled by calling RecoveryInProgress and ignoring the
-		 * result.
-		 */
-		(void) RecoveryInProgress();
-	}
-	else
+	if (!IsUnderPostmaster)
 	{
 		/*
-		 * We are either a bootstrap process or a standalone backend. Either
-		 * way, start up the XLOG machinery, and register to have it closed
-		 * down at exit.
-		 *
 		 * We don't yet have an aux-process resource owner, but StartupXLOG
 		 * and ShutdownXLOG will need one.  Hence, create said resource owner
 		 * (and register a callback to clean it up after ShutdownXLOG runs).
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 898df2ee03..34f6c89f06 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -299,7 +299,6 @@ extern void BootStrapXLOG(void);
 extern void LocalProcessControlFile(bool reset);
 extern void StartupXLOG(void);
 extern void ShutdownXLOG(int code, Datum arg);
-extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
 extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN);
-- 
2.24.3 (Apple Git-128)

#15

Alvaro Herrera

alvherre@alvh.no-ip.org

about 4 years ago

In reply to: Robert Haas (#14)

Re: RecoveryInProgress() has critical side effects

There are a couple typos. "uninitalized," in one place.

Also,

+ * If this is either a bootstrap process nor a standalone backend, start

Is "either .. nor" correct? It looks very wrong to me. I think you
meant "either .. or".

Overall, this patch seems mostly about documenting some undesirable
existing behavior. Is part of your plan to get such things fixed/
replaced entirely?

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"We're here to devour each other alive" (Hobbes)

#16

Michael Paquier

michael@paquier.xyz

about 4 years ago

In reply to: Robert Haas (#6)

Re: RecoveryInProgress() has critical side effects

On Tue, Nov 16, 2021 at 09:41:58AM -0500, Robert Haas wrote:

Maybe I'm not understanding you properly here, but it seems like you
might be forgetting that this is a local variable and thus every
backend is going to have something different. In the startup process,
it will be initialized by StartupXLOG(); in other processes, it's
currently initialized by RecoveryInProgress(), but with this patch it
wouldn't be. Either way, it's then updated by future calls to
XLogInsertRecord() as required. XLOG_FPW_CHANGE records might affect
the new value that gets set the next time XLogInsertRecord(), but they
don't directly affect doPageWrites.

My main worry here is that this changes slightly the definition of
doPageWrites across stable branches at the end of recovery as there
could be records generated there. Note that GetFullPageWriteInfo()
gets called in XLogInsert(), while Insert->fullPageWrites gets updated
before CleanupAfterArchiveRecovery(). And it may influence
the value of doPageWrites in the startup process.
--
Michael

#17

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Michael Paquier (#16)

Re: RecoveryInProgress() has critical side effects

On Sat, Dec 4, 2021 at 7:44 PM Michael Paquier <michael@paquier.xyz> wrote:

My main worry here is that this changes slightly the definition of
doPageWrites across stable branches at the end of recovery as there
could be records generated there. Note that GetFullPageWriteInfo()
gets called in XLogInsert(), while Insert->fullPageWrites gets updated
before CleanupAfterArchiveRecovery(). And it may influence
the value of doPageWrites in the startup process.

But ... so what? All the code that uses it retries if the value that
was tentatively used turns out to be wrong.

--
Robert Haas
EDB: http://www.enterprisedb.com

#18

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Robert Haas (#17)

Re: RecoveryInProgress() has critical side effects

On Tue, Dec 7, 2021 at 5:55 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Dec 4, 2021 at 7:44 PM Michael Paquier <michael@paquier.xyz> wrote:

My main worry here is that this changes slightly the definition of
doPageWrites across stable branches at the end of recovery as there
could be records generated there. Note that GetFullPageWriteInfo()
gets called in XLogInsert(), while Insert->fullPageWrites gets updated
before CleanupAfterArchiveRecovery(). And it may influence
the value of doPageWrites in the startup process.

But ... so what? All the code that uses it retries if the value that
was tentatively used turns out to be wrong.

Despite the fact that this question didn't get further discussion, and
the fact that nobody seems quite sure what the best way of proceeding
here is, my interpretation of the comments made so far is that nobody
thinks that what we have now is superior to either of the proposed
alternatives, and there's a weak preference for v2 over v1. So I
committed that one.

--
Robert Haas
EDB: http://www.enterprisedb.com