why pg_walfile_name() cannot be executed during recovery?

Started by SATYANARAYANA NARLAPURAMalmost 5 years ago13 messages

SATYANARAYANA NARLAPURAM

satyanarlapuram@gmail.com

almost 5 years ago

Hello Hackers,

Why pg_walfile_name() can't be executed under recovery? What is the best
way for me to get the current timeline and/or the file being recovering on
the standby using a postgres query? I know I can get it via process title
but don't want to go that route.

Thanks,
Satya

robertmhaas@gmail.com

almost 5 years ago

In reply to: SATYANARAYANA NARLAPURAM (#1)

Re: why pg_walfile_name() cannot be executed during recovery?

On Fri, Apr 2, 2021 at 4:23 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Why pg_walfile_name() can't be executed under recovery?

I believe the issue is that the backend executing the function might
not have an accurate idea about which TLI to use. But I don't
understand why we can't find some solution to that problem.

What is the best way for me to get the current timeline and/or the file being recovering on the standby using a postgres query? I know I can get it via process title but don't want to go that route.

pg_stat_wal_receiver has LSN and TLI information, but probably won't
help except when WAL receiver is actually active.
pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will give the
LSN at any point during recovery, but not the TLI. We might have some
gaps in this area...

--
Robert Haas
EDB: http://www.enterprisedb.com

Jehan-Guillaume de Rorthais

jgdr@dalibo.com

almost 5 years ago

In reply to: Robert Haas (#2)

Re: why pg_walfile_name() cannot be executed during recovery?

On Fri, 2 Apr 2021 08:22:09 -0400
Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 2, 2021 at 4:23 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Why pg_walfile_name() can't be executed under recovery?

I believe the issue is that the backend executing the function might
not have an accurate idea about which TLI to use. But I don't
understand why we can't find some solution to that problem.

What is the best way for me to get the current timeline and/or the file
being recovering on the standby using a postgres query? I know I can get it
via process title but don't want to go that route.

pg_stat_wal_receiver has LSN and TLI information, but probably won't
help except when WAL receiver is actually active.
pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will give the
LSN at any point during recovery, but not the TLI. We might have some
gaps in this area...

Yep, see previous discussion:
/messages/by-id/20190723180518.635ac554@firost

The status by the time was to consider a new view eg. pg_stat_recovery, to
report various recovery stats.

But maybe the best place now would be to include it in the new pg_stat_wal view?

As I'm interesting with this feature as well, I volunteer to work on it as
author or reviewer.

Regards,

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

almost 4 years ago

In reply to: Robert Haas (#2)

1 attachment(s)

Re: why pg_walfile_name() cannot be executed during recovery?

On Fri, Apr 2, 2021 at 5:52 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 2, 2021 at 4:23 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

Why pg_walfile_name() can't be executed under recovery?

I believe the issue is that the backend executing the function might
not have an accurate idea about which TLI to use. But I don't
understand why we can't find some solution to that problem.

What is the best way for me to get the current timeline and/or the file being recovering on the standby using a postgres query? I know I can get it via process title but don't want to go that route.

pg_stat_wal_receiver has LSN and TLI information, but probably won't
help except when WAL receiver is actually active.
pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will give the
LSN at any point during recovery, but not the TLI. We might have some
gaps in this area...

I spent some time today to allow pg_walfile_{name, name_offset} run in
recovery. Timeline ID is computed while in recovery as follows - WAL
receiver's last received and flushed WAL record's TLI if it's
streaming, otherwise the last replayed WAL record's TLI. This way,
these functions can be used on standby or PITR server or even in crash
recovery if the server opens up for read-only connections.

Please have a look at the attached patch.

If the approach looks okay, I can add notes in the documentation.

Regards,
Bharath Rupireddy.

Attachments:

v1-0001-Allow-pg_walfile_-name-name_offset-to-run-in-reco.patchapplication/octet-stream; name=v1-0001-Allow-pg_walfile_-name-name_offset-to-run-in-reco.patchDownload

From 6ebdff0cf9b5314fcee8227bcab92e21f8d8a866 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 7 Apr 2022 13:23:04 +0000
Subject: [PATCH v1] Allow pg_walfile_{name, name_offset} to run in recovery

Right now, pg_walfile_name and pg_walfile_name_offset don't run
while the server is in recovery (standby, PITR, crash). This
reduces their usability if the server opens up for read-only
connections in recovery.

This patch enables them to compute WAL file name even in recovery
with a timeline ID of last received and flushed WAL record of WAL
receiver if it's streaming, otherwise of the last replayed WAL
record.
---
 src/backend/access/transam/xlogfuncs.c     | 68 +++++++++++++++++-----
 src/backend/replication/walreceiverfuncs.c | 15 +++++
 src/include/replication/walreceiver.h      |  1 +
 3 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index b61ae6c0b4..65c04caec3 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -44,6 +44,8 @@
 static StringInfo label_file;
 static StringInfo tblspc_map_file;
 
+static TimeLineID GetTLIForWALFileNameComputation(void);
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -316,6 +318,50 @@ pg_last_wal_replay_lsn(PG_FUNCTION_ARGS)
 	PG_RETURN_LSN(recptr);
 }
 
+/*
+ * Get timeLine ID for computing WAL file name.
+ *
+ * When not in recovery, it returns the current timeline of the server.
+ *
+ * When in recovery and WAL receiver is streaming, it returns the timeline of
+ * the last WAL record received and flushed to disk.
+ *
+ * When in recovery and WAL receiver is not streaming, it returns the timeline
+ * of last WAL record successfully replayed.
+ */
+static TimeLineID
+GetTLIForWALFileNameComputation(void)
+{
+	TimeLineID	tli;
+
+	if (!RecoveryInProgress())
+		tli = GetWALInsertionTimeLine();
+	else
+	{
+		/*
+		 * When WAL receiver isn't streaming, do not read timeline ID from its
+		 * shared memory as we don't know yet whether it will come up and start
+		 * streaming.
+		 */
+		if (WalRcvGetState() == WALRCV_STREAMING)
+			(void) GetWalRcvFlushRecPtr(NULL, &tli);
+		else
+			(void) GetXLogReplayRecPtr(&tli);
+	}
+
+	/*
+	 * The computed timeline ID can be invalid, especially when in recovery.
+	 * Let's be cautious and emit an error, otherwise the caller may give wrong
+	 * WAL file name.
+	 */
+	if (tli == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("TimeLine ID %u is invalid", tli)));
+
+	return tli;
+}
+
 /*
  * Compute an xlog file name and decimal byte offset given a WAL location,
  * such as is returned by pg_backup_stop() or pg_switch_wal().
@@ -336,13 +382,9 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDesc	resultTupleDesc;
 	HeapTuple	resultHeapTuple;
 	Datum		result;
+	TimeLineID	tli;
 
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name_offset()")));
+	tli = GetTLIForWALFileNameComputation();
 
 	/*
 	 * Construct a tuple descriptor for the result row.  This must match this
@@ -360,8 +402,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	 * xlogfilename
 	 */
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, GetWALInsertionTimeLine(), xlogsegno,
-				 wal_segment_size);
+	XLogFileName(xlogfilename, tli, xlogsegno, wal_segment_size);
 
 	values[0] = CStringGetTextDatum(xlogfilename);
 	isnull[0] = false;
@@ -394,17 +435,12 @@ pg_walfile_name(PG_FUNCTION_ARGS)
 	XLogSegNo	xlogsegno;
 	XLogRecPtr	locationpoint = PG_GETARG_LSN(0);
 	char		xlogfilename[MAXFNAMELEN];
+	TimeLineID	tli;
 
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name()")));
+	tli = GetTLIForWALFileNameComputation();
 
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, GetWALInsertionTimeLine(), xlogsegno,
-				 wal_segment_size);
+	XLogFileName(xlogfilename, tli, xlogsegno, wal_segment_size);
 
 	PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
 }
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 90798b9d53..0a17a5b4cd 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -171,6 +171,21 @@ WalRcvStreaming(void)
 		return false;
 }
 
+/*
+ * Get walreceiver state.
+ */
+WalRcvState
+WalRcvGetState(void)
+{
+	WalRcvState state;
+
+	SpinLockAcquire(&WalRcv->mutex);
+	state = WalRcv->walRcvState;
+	SpinLockRelease(&WalRcv->mutex);
+
+	return state;
+}
+
 /*
  * Stop walreceiver (if running) and wait for it to die.
  * Executed by the Startup process.
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 92f73a55b8..d950425ab2 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -460,6 +460,7 @@ extern void WalRcvShmemInit(void);
 extern void ShutdownWalRcv(void);
 extern bool WalRcvStreaming(void);
 extern bool WalRcvRunning(void);
+extern WalRcvState WalRcvGetState(void);
 extern void RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr,
 								 const char *conninfo, const char *slotname,
 								 bool create_temp_slot);
-- 
2.25.1

robertmhaas@gmail.com

almost 4 years ago

In reply to: Bharath Rupireddy (#4)

Re: why pg_walfile_name() cannot be executed during recovery?

On Thu, Apr 7, 2022 at 9:32 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

I spent some time today to allow pg_walfile_{name, name_offset} run in
recovery. Timeline ID is computed while in recovery as follows - WAL
receiver's last received and flushed WAL record's TLI if it's
streaming, otherwise the last replayed WAL record's TLI. This way,
these functions can be used on standby or PITR server or even in crash
recovery if the server opens up for read-only connections.

I don't think this is a good definition. Suppose I ask for
pg_walfile_name() using an older LSN. With this approach, we're going
to get a filename based on the idea that the TLI that was in effect
back then is the same one as the TLI that is in effect now, which
might not be true. For example, suppose that the current TLI is 2 and
it branched off of timeline 1 at 10/0. If I ask for
pg_walfile_name('F/0'), it's going to give me the name of a WAL file
that has never existed. That seems bad.

It's also worth noting that there's a bit of a definitional problem
here. If in the same situation, I ask for pg_walfile_name('11/0'),
it's going to give me a filename based on TLI 2, but there's also a
WAL file for that LSN with TLI 1. How do we know which one the user
wants? Perhaps one idea would be to say that the relevant TLI is the
one which was in effect at the time that LSN was replayed. If we do
that, what about future LSNs? We could assume that for future LSNs,
the TLI should be the same as the current TLI, but maybe that's also
misleading, because recovery_target_timeline could be set.

I think it's really important to start by being precise about the
question that we think pg_walfile_name() ought to be answering. If we
don't know that, then we really can't say what TLI it should be using.
It's not hard to make the function return SOME answer using SOME TLI,
but then it's not clear that the answer is the right one for any
particular purpose. And in that case the function is more dangerous
than useful, because people will write code that uses it to do stuff,
and then that stuff won't actually work correctly under all
circumstances.

--
Robert Haas
EDB: http://www.enterprisedb.com

Michael Paquier

michael@paquier.xyz

almost 4 years ago

In reply to: Robert Haas (#5)

Re: why pg_walfile_name() cannot be executed during recovery?

On Thu, Apr 07, 2022 at 11:37:15AM -0400, Robert Haas wrote:

It's also worth noting that there's a bit of a definitional problem
here. If in the same situation, I ask for pg_walfile_name('11/0'),
it's going to give me a filename based on TLI 2, but there's also a
WAL file for that LSN with TLI 1. How do we know which one the user
wants? Perhaps one idea would be to say that the relevant TLI is the
one which was in effect at the time that LSN was replayed. If we do
that, what about future LSNs? We could assume that for future LSNs,
the TLI should be the same as the current TLI, but maybe that's also
misleading, because recovery_target_timeline could be set.

FWIW, for future positions, I'd be rather on board with the concept of
using the TLI currently being replayed, but as you say that comes down
to the definition borders we want to use. Another possibility would
be to return an error and kick the can down the road if we are unsure
of what the right behavior is. For past positions, this should go
through a lookup of the timeline history file (the patch does not do
that at quick glance).

I think it's really important to start by being precise about the
question that we think pg_walfile_name() ought to be answering. If we
don't know that, then we really can't say what TLI it should be using.
It's not hard to make the function return SOME answer using SOME TLI,
but then it's not clear that the answer is the right one for any
particular purpose. And in that case the function is more dangerous
than useful, because people will write code that uses it to do stuff,
and then that stuff won't actually work correctly under all
circumstances.

Agreed.
--
Michael

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

almost 4 years ago

In reply to: Robert Haas (#5)

Re: why pg_walfile_name() cannot be executed during recovery?

On Thu, Apr 7, 2022 at 9:07 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Apr 7, 2022 at 9:32 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

I spent some time today to allow pg_walfile_{name, name_offset} run in
recovery. Timeline ID is computed while in recovery as follows - WAL
receiver's last received and flushed WAL record's TLI if it's
streaming, otherwise the last replayed WAL record's TLI. This way,
these functions can be used on standby or PITR server or even in crash
recovery if the server opens up for read-only connections.

I don't think this is a good definition. Suppose I ask for
pg_walfile_name() using an older LSN. With this approach, we're going
to get a filename based on the idea that the TLI that was in effect
back then is the same one as the TLI that is in effect now, which
might not be true. For example, suppose that the current TLI is 2 and
it branched off of timeline 1 at 10/0. If I ask for
pg_walfile_name('F/0'), it's going to give me the name of a WAL file
that has never existed. That seems bad.

It's also worth noting that there's a bit of a definitional problem
here. If in the same situation, I ask for pg_walfile_name('11/0'),
it's going to give me a filename based on TLI 2, but there's also a
WAL file for that LSN with TLI 1. How do we know which one the user
wants? Perhaps one idea would be to say that the relevant TLI is the
one which was in effect at the time that LSN was replayed. If we do
that, what about future LSNs? We could assume that for future LSNs,
the TLI should be the same as the current TLI, but maybe that's also
misleading, because recovery_target_timeline could be set.

Fundamental question - should the pg_walfile_{name, name_offset} check
whether the file with the computed WAL file name exists on the server
right now or ever existed earlier? Right now, they don't do that, see
[1]: postgres=# select * from pg_walfile_name('50000/dfdf'); pg_walfile_name -------------------------- 000000010005000000000000 (1 row) postgres=# select * from pg_walfile_name_offset('50000/dfdf'); file_name | file_offset --------------------------+------------- 000000010005000000000000 | 57311 (1 row)

I think we can make the functions more robust:
pg_walfile_{name, name_offset}(lsn, check_if_file_exists = false, tli
= invalid_timelineid) - when check_if_file_exists is true checks for
the computed WAL file existence and when a valid tli is provided uses
it in computing the WAL file name. When tli isn't provided, it
continues to use insert tli for primary, and in recovery it uses tli
as proposed in my patch. Perhaps, it can also do (as Michael
suggested) this - if check_if_file_exists is true and tli isn't
provided and there's timeline history, then it can go look at all the
timelines and whether the file exists with the computed name with
history tli.

I think it's really important to start by being precise about the
question that we think pg_walfile_name() ought to be answering. If we
don't know that, then we really can't say what TLI it should be using.
It's not hard to make the function return SOME answer using SOME TLI,
but then it's not clear that the answer is the right one for any
particular purpose. And in that case the function is more dangerous
than useful, because people will write code that uses it to do stuff,
and then that stuff won't actually work correctly under all
circumstances.

Yes, once we agree on the semantics of these functions, having better
documentation will help.

Thoughts?

[1]: postgres=# select * from pg_walfile_name('50000/dfdf'); pg_walfile_name -------------------------- 000000010005000000000000 (1 row) postgres=# select * from pg_walfile_name_offset('50000/dfdf'); file_name | file_offset --------------------------+------------- 000000010005000000000000 | 57311 (1 row)
postgres=# select * from pg_walfile_name('50000/dfdf');
pg_walfile_name
--------------------------
000000010005000000000000
(1 row)
postgres=# select * from pg_walfile_name_offset('50000/dfdf');
file_name | file_offset
--------------------------+-------------
000000010005000000000000 | 57311
(1 row)

Regards,
Bharath Rupireddy.

robertmhaas@gmail.com

almost 4 years ago

In reply to: Bharath Rupireddy (#7)

Re: why pg_walfile_name() cannot be executed during recovery?

On Fri, Apr 8, 2022 at 9:31 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Fundamental question - should the pg_walfile_{name, name_offset} check
whether the file with the computed WAL file name exists on the server
right now or ever existed earlier? Right now, they don't do that, see
[1].

I don't think that checking whether the file exists is the right
approach. However, I do think that it's important to be precise about
which TLI is going to be used. I think it would be reasonable to
redefine this function (on both the primary and the standby) so that
the TLI that is used is the one that was in effect at the time record
at the given LSN was either written or replayed. Then, you could
potentially use this function to figure out whether you still have the
WAL files that are needed to replay up to some previous point in the
WAL stream. However, what about the segments where we switched from
one TLI to the next in the middle of the segment? There, you probably
need both the old and the new segments, or maybe if you're trying to
stream them you only need the new one because we have some weird
special case that will send the segment from the new timeline when the
segment from the old timeline is requested. So you couldn't just call
this function on one LSN per segment and call it good, and it wouldn't
necessarily be the case that the filenames you got back were exactly
the ones you needed.

So I'm not entirely sure this proposal is good enough, but it at least
would have the advantage of meaning that the filename you get back is
one that existed at some point in time and somebody used it for
something.

--
Robert Haas
EDB: http://www.enterprisedb.com

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

almost 4 years ago

In reply to: Robert Haas (#8)

1 attachment(s)

Re: why pg_walfile_name() cannot be executed during recovery?

On Fri, Apr 8, 2022 at 7:28 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 8, 2022 at 9:31 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Fundamental question - should the pg_walfile_{name, name_offset} check
whether the file with the computed WAL file name exists on the server
right now or ever existed earlier? Right now, they don't do that, see
[1].

I don't think that checking whether the file exists is the right
approach. However, I do think that it's important to be precise about
which TLI is going to be used. I think it would be reasonable to
redefine this function (on both the primary and the standby) so that
the TLI that is used is the one that was in effect at the time record
at the given LSN was either written or replayed. Then, you could
potentially use this function to figure out whether you still have the
WAL files that are needed to replay up to some previous point in the
WAL stream. However, what about the segments where we switched from
one TLI to the next in the middle of the segment? There, you probably
need both the old and the new segments, or maybe if you're trying to
stream them you only need the new one because we have some weird
special case that will send the segment from the new timeline when the
segment from the old timeline is requested. So you couldn't just call
this function on one LSN per segment and call it good, and it wouldn't
necessarily be the case that the filenames you got back were exactly
the ones you needed.

So I'm not entirely sure this proposal is good enough, but it at least
would have the advantage of meaning that the filename you get back is
one that existed at some point in time and somebody used it for
something.

Using insert tli when not in recovery and using tli of the last WAL
replayed record in crash/archive/standby recovery, seems a reasonable
choice to me. I've also added a note in the docs.

Attaching v2 with the above change. Please review it further.

Regards,
Bharath Rupireddy.

Attachments:

v2-0001-Allow-pg_walfile_-name-name_offset-to-run-in-reco.patchapplication/x-patch; name=v2-0001-Allow-pg_walfile_-name-name_offset-to-run-in-reco.patchDownload

From 37a7587abb11b6ebcb82d0fbf3cff7505355a679 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 9 Apr 2022 13:14:03 +0000
Subject: [PATCH v2] Allow pg_walfile_{name, name_offset} to run in recovery

Right now, pg_walfile_name and pg_walfile_name_offset don't run
while the server is in recovery (standby, PITR/archive, crash).
This reduces their usability if the server opens up for
read-only connections in recovery.

This patch enables them to compute WAL file name even in recovery
with a timeline ID of last the last successfully replayed WAL
record. They continue to use the timeline ID with which new
WAL records are being inserted and flushed when not in recovery.
---
 doc/src/sgml/func.sgml                 |  8 +++++
 src/backend/access/transam/xlogfuncs.c | 48 +++++++++++++++++---------
 2 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5047e090db..0a08e34813 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28442,6 +28442,14 @@ postgres=# SELECT * FROM pg_walfile_name_offset((pg_backup_stop()).lsn);
     needs to be archived.
    </para>
 
+   <para>
+    Both <function>pg_walfile_name_offset</function> and <function>pg_walfile_name</function>
+    use a timeline ID internally to the extract write-ahead log file name. When
+    not in recovery, they use the timeline ID with which new write-ahead log
+    records are being inserted and flushed. When in recovery, they use the
+    timeline ID of the last successfully replayed write-ahead log record.
+   </para>
+
   </sect2>
 
   <sect2 id="functions-recovery-control">
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index b61ae6c0b4..795fd8d26b 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -44,6 +44,8 @@
 static StringInfo label_file;
 static StringInfo tblspc_map_file;
 
+static TimeLineID GetTLIForWALFileNameComputation(void);
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -316,6 +318,30 @@ pg_last_wal_replay_lsn(PG_FUNCTION_ARGS)
 	PG_RETURN_LSN(recptr);
 }
 
+/*
+ * Get timeLine ID for computing WAL file name.
+ *
+ * When not in recovery, it returns the timeline into which new WAL is being
+ * inserted and flushed.
+ *
+ * When in crash/archive/standby recovery, it returns the timeline of the last
+ * WAL record that is successfully replayed.
+ */
+static TimeLineID
+GetTLIForWALFileNameComputation(void)
+{
+	TimeLineID	tli;
+
+	if (RecoveryInProgress())
+		(void) GetXLogReplayRecPtr(&tli);
+	else
+		tli = GetWALInsertionTimeLine();
+
+	Assert(tli > 0);
+
+	return tli;
+}
+
 /*
  * Compute an xlog file name and decimal byte offset given a WAL location,
  * such as is returned by pg_backup_stop() or pg_switch_wal().
@@ -336,13 +362,9 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDesc	resultTupleDesc;
 	HeapTuple	resultHeapTuple;
 	Datum		result;
+	TimeLineID	tli;
 
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name_offset()")));
+	tli = GetTLIForWALFileNameComputation();
 
 	/*
 	 * Construct a tuple descriptor for the result row.  This must match this
@@ -360,8 +382,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	 * xlogfilename
 	 */
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, GetWALInsertionTimeLine(), xlogsegno,
-				 wal_segment_size);
+	XLogFileName(xlogfilename, tli, xlogsegno, wal_segment_size);
 
 	values[0] = CStringGetTextDatum(xlogfilename);
 	isnull[0] = false;
@@ -394,17 +415,12 @@ pg_walfile_name(PG_FUNCTION_ARGS)
 	XLogSegNo	xlogsegno;
 	XLogRecPtr	locationpoint = PG_GETARG_LSN(0);
 	char		xlogfilename[MAXFNAMELEN];
+	TimeLineID	tli;
 
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name()")));
+	tli = GetTLIForWALFileNameComputation();
 
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, GetWALInsertionTimeLine(), xlogsegno,
-				 wal_segment_size);
+	XLogFileName(xlogfilename, tli, xlogsegno, wal_segment_size);
 
 	PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
 }
-- 
2.25.1

x4mmm@yandex-team.ru

almost 4 years ago

In reply to: Bharath Rupireddy (#9)

Re: why pg_walfile_name() cannot be executed during recovery?

On 9 Apr 2022, at 18:30, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

Using insert tli when not in recovery and using tli of the last WAL
replayed record in crash/archive/standby recovery, seems a reasonable
choice to me.

Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that you are solving. What is the purpose of the WAL file name you want to obtain?

pg_walfile_name() - is a formatting function. With TLI as an hidden argument. If we want it to work on Standby we should just convert it to pure formatting function without access the the DB state, pass TLI as an argument.
Making implicit TLI computation with certain expectations is not a good idea IMV.

pg_walfile_name() could just read .history file, determine which TLI contains given LSN and format the name. And still there's a tricky segments during TLI switch.

Either way we can rename the function to pg_walfile_name_as_if_on_timeline_of_last_wal_replayed().

Thanks!

Best regards, Andrey Borodin.

robertmhaas@gmail.com

almost 4 years ago

In reply to: Andrey Borodin (#10)

Re: why pg_walfile_name() cannot be executed during recovery?

On Sat, Apr 9, 2022 at 12:25 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that you are solving. What is the purpose of the WAL file name you want to obtain?

Yeah, I'd also like to know this.

--
Robert Haas
EDB: http://www.enterprisedb.com

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

over 3 years ago

In reply to: Robert Haas (#11)

Re: why pg_walfile_name() cannot be executed during recovery?

On Sat, Apr 9, 2022 at 10:21 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Apr 9, 2022 at 12:25 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that you are solving. What is the purpose of the WAL file name you want to obtain?

Yeah, I'd also like to know this.

IMO, uses of pg_walfile_{name, name_offset} are plenty. Say, I have
LSNs (say, flush, insert, replayed or WAL receiver latest received)
and I would like to know the WAL file name and offset in an app
connecting to postgres or a control plane either for doing some
reporting or figuring out whether a WAL file exists given an LSN or
for some other reason. With these functions restricted when the server
is in recovery mode, the apps or control plane code can't use them and
they have to do if (!pg_is_in_recovery()) {select * from
pg_walfile_{name, name_offset}.

Am I missing any other important use-cases?

Regards,
Bharath Rupireddy.

x4mmm@yandex-team.ru

over 3 years ago

In reply to: Bharath Rupireddy (#12)

Re: why pg_walfile_name() cannot be executed during recovery?

22 апр. 2022 г., в 19:15, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> написал(а):

On Sat, Apr 9, 2022 at 10:21 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Apr 9, 2022 at 12:25 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that you are solving. What is the purpose of the WAL file name you want to obtain?

Yeah, I'd also like to know this.

IMO, uses of pg_walfile_{name, name_offset} are plenty. Say, I have
LSNs (say, flush, insert, replayed or WAL receiver latest received)

AFAIK flush, receive and replay LSNs may be on 3 different timelines rendering two names incorrect. Actually, this proves that pg_wal_filename() should not be called on Standby with a present function prototype.

and I would like to know the WAL file name and offset in an app
connecting to postgres or a control plane either for doing some
reporting

What kind of reporting?

or figuring out whether a WAL file exists given an LSN or
for some other reason.

There might me many WAL files on the same LSN. Please, specify more detailed scenario to use WAL file name.

With these functions restricted when the server
is in recovery mode, the apps or control plane code can't use them and
they have to do if (!pg_is_in_recovery()) {select * from
pg_walfile_{name, name_offset}.

Am I missing any other important use-cases?

I do not see correct use-case among these. You justify necessity to run pg_wal_filename() on Standby by having a LSN (not a problem), by doing some kind of reporting (to broad problem) and checking existence of some WAL file (more details needed). What is the problem leading to checking the existence of the file?

Thanks!

Best regards, Andrey Borodin.