pg_xlogfile_name_offset() et al and recovery

Started by Amit Langoteover 9 years ago11 messages
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp

Currently in HEAD and 9.6, one can issue a non-exclusive backup on
standby, so this is OK:

select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)

select pg_start_backup('sby-bkp-test', 'f', 'f');
pg_start_backup
-----------------
0/5000220
(1 row)

However the following happens:

select pg_xlogfile_name_offset(pg_start_backup('sby-bkp-test', 'f', 'f'));
ERROR: recovery is in progress
HINT: pg_xlogfile_name_offset() cannot be executed during recovery.

Should this restriction be relaxed or am I missing something?

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#1)
Re: pg_xlogfile_name_offset() et al and recovery

On 2016/05/19 17:34, Amit Langote wrote:

Currently in HEAD and 9.6, one can issue a non-exclusive backup on
standby, so this is OK:

select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)

select pg_start_backup('sby-bkp-test', 'f', 'f');
pg_start_backup
-----------------
0/5000220
(1 row)

However the following happens:

select pg_xlogfile_name_offset(pg_start_backup('sby-bkp-test', 'f', 'f'));
ERROR: recovery is in progress
HINT: pg_xlogfile_name_offset() cannot be executed during recovery.

Should this restriction be relaxed or am I missing something?

Answering my own question:

While reading the thread "BUG #14230: Wrong timeline returned by
pg_stop_backup on a standby", I came to know that ThisTimelineId is
invalid on standby. And because pg_xlogfile_name_offset() uses the same
to compute its result, it makes sense to prevent it from being used on a
standby.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Michael Paquier
michael.paquier@gmail.com
In reply to: Amit Langote (#2)
1 attachment(s)
Re: pg_xlogfile_name_offset() et al and recovery

On Thu, Jul 7, 2016 at 3:59 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

While reading the thread "BUG #14230: Wrong timeline returned by
pg_stop_backup on a standby", I came to know that ThisTimelineId is
invalid on standby. And because pg_xlogfile_name_offset() uses the same
to compute its result, it makes sense to prevent it from being used on a
standby.

To be honest, I have always found that this restriction was hard to
justify on a function that basically performs a static calculation. So
what about removing this error and use GetXLogReplayRecPtr() to fetch
the timeline ID? Per se the patch attached:
=# select * from pg_xlogfile_name_offset(pg_last_xlog_replay_location());
file_name | file_offset
--------------------------+-------------
000000010000000000000003 | 2192
(1 row)

The same applies of course to pg_xlogfile_name():
=# select pg_xlogfile_name(pg_last_xlog_replay_location());
pg_xlogfile_name
--------------------------
000000010000000000000003
(1 row)
--
Michael

Attachments:

xlogfuncs-tli-recovery.patchtext/x-patch; charset=US-ASCII; name=xlogfuncs-tli-recovery.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index baef80d..46ca93e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17796,7 +17796,9 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
     these functions return the name of the preceding transaction log file.
     This is usually the desired behavior for managing transaction log archiving
     behavior, since the preceding file is the last one that currently
-    needs to be archived.
+    needs to be archived. The timeline number used to build the log file
+    name refers to the current timeline on a master server, and the timeline
+    currently being replayed on a server in recovery.
    </para>
 
    <para>
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 33383b4..91f8e03 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -464,12 +464,7 @@ pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDesc	resultTupleDesc;
 	HeapTuple	resultHeapTuple;
 	Datum		result;
-
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("pg_xlogfile_name_offset() cannot be executed during recovery.")));
+	TimeLineID	tli;
 
 	/*
 	 * Construct a tuple descriptor for the result row.  This must match this
@@ -483,11 +478,16 @@ pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
 
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
+	if (RecoveryInProgress())
+		(void) GetXLogReplayRecPtr(&tli);
+	else
+		tli = ThisTimeLineID;
+
 	/*
 	 * xlogfilename
 	 */
 	XLByteToPrevSeg(locationpoint, xlogsegno);
-	XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno);
+	XLogFileName(xlogfilename, tli, xlogsegno);
 
 	values[0] = CStringGetTextDatum(xlogfilename);
 	isnull[0] = false;
@@ -520,15 +520,15 @@ pg_xlogfile_name(PG_FUNCTION_ARGS)
 	XLogSegNo	xlogsegno;
 	XLogRecPtr	locationpoint = PG_GETARG_LSN(0);
 	char		xlogfilename[MAXFNAMELEN];
+	TimeLineID	tli;
 
 	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-		 errhint("pg_xlogfile_name() cannot be executed during recovery.")));
+		(void) GetXLogReplayRecPtr(&tli);
+	else
+		tli = ThisTimeLineID;
 
 	XLByteToPrevSeg(locationpoint, xlogsegno);
-	XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno);
+	XLogFileName(xlogfilename, tli, xlogsegno);
 
 	PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
 }
#4Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Michael Paquier (#3)
Re: pg_xlogfile_name_offset() et al and recovery

On 2016/07/07 16:26, Michael Paquier wrote:

On Thu, Jul 7, 2016 at 3:59 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

While reading the thread "BUG #14230: Wrong timeline returned by
pg_stop_backup on a standby", I came to know that ThisTimelineId is
invalid on standby. And because pg_xlogfile_name_offset() uses the same
to compute its result, it makes sense to prevent it from being used on a
standby.

To be honest, I have always found that this restriction was hard to
justify on a function that basically performs a static calculation. So
what about removing this error and use GetXLogReplayRecPtr() to fetch
the timeline ID? Per se the patch attached:
=# select * from pg_xlogfile_name_offset(pg_last_xlog_replay_location());
file_name | file_offset
--------------------------+-------------
000000010000000000000003 | 2192
(1 row)

The same applies of course to pg_xlogfile_name():
=# select pg_xlogfile_name(pg_last_xlog_replay_location());
pg_xlogfile_name
--------------------------
000000010000000000000003
(1 row)

+1 to enabling these functions' usage on standby and the patch.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Amit Langote (#4)
Re: pg_xlogfile_name_offset() et al and recovery

At Thu, 7 Jul 2016 16:48:57 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in <3649f1e4-ac74-841e-6ae5-cfe0022aa50b@lab.ntt.co.jp>

On 2016/07/07 16:26, Michael Paquier wrote:

On Thu, Jul 7, 2016 at 3:59 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

While reading the thread "BUG #14230: Wrong timeline returned by
pg_stop_backup on a standby", I came to know that ThisTimelineId is
invalid on standby. And because pg_xlogfile_name_offset() uses the same
to compute its result, it makes sense to prevent it from being used on a
standby.

To be honest, I have always found that this restriction was hard to
justify on a function that basically performs a static calculation. So
what about removing this error and use GetXLogReplayRecPtr() to fetch
the timeline ID? Per se the patch attached:
=# select * from pg_xlogfile_name_offset(pg_last_xlog_replay_location());
file_name | file_offset
--------------------------+-------------
000000010000000000000003 | 2192
(1 row)

The same applies of course to pg_xlogfile_name():
=# select pg_xlogfile_name(pg_last_xlog_replay_location());
pg_xlogfile_name
--------------------------
000000010000000000000003
(1 row)

+1 to enabling these functions' usage on standby and the patch.

Strongly agreed. At first I thought that using replay-loc TLI is
bogus but ThisTimeLineID for non-recovery time is also bogus at
almost the same degree.

=# select * from pg_xlogfile_name_offset('0/100'::pg_lsn);
file_name | file_offset
--------------------------+-------------
000000030000000000000000 | 256

The rest of the "almost" is master's timeline evolution. A master
won't get timeline evolution during runtime but a standby can do.

As the result, this relaxation adds more good than bad and I
agree to this.

Addition to that, I'd like to propose to add a description (or a
notice) about this restriction in documentation that the timeline
part of the file name may be wrong. Will post soon.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Kyotaro HORIGUCHI (#5)
Re: pg_xlogfile_name_offset() et al and recovery

I wrote a bit wrong thing.

At Thu, 07 Jul 2016 17:20:52 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20160707.172052.213690356.horiguchi.kyotaro@lab.ntt.co.jp>

To be honest, I have always found that this restriction was hard to
justify on a function that basically performs a static calculation. So
what about removing this error and use GetXLogReplayRecPtr() to fetch
the timeline ID? Per se the patch attached:
=# select * from pg_xlogfile_name_offset(pg_last_xlog_replay_location());
file_name | file_offset
--------------------------+-------------
000000010000000000000003 | 2192
(1 row)

The same applies of course to pg_xlogfile_name():
=# select pg_xlogfile_name(pg_last_xlog_replay_location());
pg_xlogfile_name
--------------------------
000000010000000000000003
(1 row)

+1 to enabling these functions' usage on standby and the patch.

Strongly agreed. At first I thought that using replay-loc TLI is
bogus but ThisTimeLineID for non-recovery time is also bogus at
almost the same degree.

=# select * from pg_xlogfile_name_offset('0/100'::pg_lsn);
file_name | file_offset
--------------------------+-------------
000000030000000000000000 | 256

The rest of the "almost" is master's timeline evolution. A master

It makes non sense. should be the follwoing,

| The rest of the "almost" is the upstreams' timeline evolution
| and promotion of itself. A master

won't get timeline evolution during runtime but a standby can do.

As the result, this relaxation adds more good than bad and I
agree to this.

Addition to that, I'd like to propose to add a description (or a
notice) about this restriction in documentation that the timeline
part of the file name may be wrong. Will post soon.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Amit Langote (#4)
1 attachment(s)
Re: pg_xlogfile_name_offset() et al and recovery

Hello,

At Thu, 07 Jul 2016 17:26:15 +0900, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20160707.172615.232806225.horiguchi.kyotaro@lab.ntt.co.jp>

As the result, this relaxation adds more good than bad and I
agree to this.

Addition to that, I'd like to propose to add a description (or a
notice) about this restriction in documentation that the timeline
part of the file name may be wrong. Will post soon.

The following sentense will be added by this patch. Does it make sense?

| Note that these two functions use the timeline ID at the time
| to construct the file name. Therefore they give wrong names for
| transaction log locations on the other timelines. This can
| happen during runtime on replication standby when a promotion
| occurs on itself or upstreams.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

pg_xlogfile_name_doc_note.difftext/x-patch; charset=us-asciiDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index baef80d..451f0ec 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17796,7 +17796,11 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
     these functions return the name of the preceding transaction log file.
     This is usually the desired behavior for managing transaction log archiving
     behavior, since the preceding file is the last one that currently
-    needs to be archived.
+    needs to be archived. Note that these two functions use the timeline ID at
+    the time to construct the file name. Therefore they give wrong names for
+    transaction log locations on the other timelines. This can happen during
+    runtime on replication standby when a promotion occurs on itself or
+    upstreams.
    </para>
 
    <para>
#8Simon Riggs
simon@2ndquadrant.com
In reply to: Michael Paquier (#3)
Re: pg_xlogfile_name_offset() et al and recovery

On 7 July 2016 at 08:26, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Jul 7, 2016 at 3:59 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

While reading the thread "BUG #14230: Wrong timeline returned by
pg_stop_backup on a standby", I came to know that ThisTimelineId is
invalid on standby. And because pg_xlogfile_name_offset() uses the same
to compute its result, it makes sense to prevent it from being used on a
standby.

To be honest, I have always found that this restriction was hard to
justify on a function that basically performs a static calculation.

I know its annoying behaviour, but this is not an immutable function i.e.
not a static calculation.
The timeline is not so much invalid as variable over time.

The behaviour of that function is defined in backbranches, so I suggest we
should not alter that now.

What we can do is have another function that makes it clearer that the
answer is variable.
pg_xlogfile_name_offset(offset, timelineid)
always succeeds on standby but needs to be told what timelineid to use

then we have another function
pg_last_xact_replay_timeline() that allows you to find out the current
timeline

The actual problem here is somewhat more convoluted, because we would like
to know the timeline of the basebackup at start and stop. That information
is not easily available from the backup label data returned by
pg_stop_backup().
We would like to do something like this...

select pg_xlogfile_name_offset(l.lsn, l.tli) from pg_stop_backup(false) l;

So I suggest we add another column to the output of pg_stop_backup() to
return the tli value directly.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Michael Paquier
michael.paquier@gmail.com
In reply to: Simon Riggs (#8)
Re: pg_xlogfile_name_offset() et al and recovery

On Thu, Jul 7, 2016 at 6:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

The behaviour of that function is defined in backbranches, so I suggest we
should not alter that now.

Hm..

What we can do is have another function that makes it clearer that the
answer is variable.
pg_xlogfile_name_offset(offset, timelineid)
always succeeds on standby but needs to be told what timelineid to use

This has clearly value, I have wanted to override the timeline number
a couple of times now. I have always done so with a custom function
and saw many people doing it at application level, but it is a weird
design to have the 1-argument version fail for a node in recovery, and
the 2-argument version succeed. I'd rather have everything consistent
to be honest, with the same function name, and the behavior clearly
documented.

then we have another function
pg_last_xact_replay_timeline() that allows you to find out the current
timeline

It would be good to have an equivalent pg_current_xlog_timeline as
well that can run on nodes not in recovery. I agree that having a
separate function makes more sense as its equivalents for WAL
positions.

The actual problem here is somewhat more convoluted, because we would like
to know the timeline of the basebackup at start and stop. That information
is not easily available from the backup label data returned by
pg_stop_backup().
We would like to do something like this...

select pg_xlogfile_name_offset(l.lsn, l.tli) from pg_stop_backup(false) l;

So I suggest we add another column to the output of pg_stop_backup() to
return the tli value directly.

This would be useful as well, that's less custom-parsing logic for
applications based on the segment name in backup_label.

Note that I am not personally planning to write a patch for all that,
but any people are welcome to pick up this task!
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Simon Riggs
simon@2ndquadrant.com
In reply to: Michael Paquier (#9)
Re: pg_xlogfile_name_offset() et al and recovery

On 8 July 2016 at 00:43, Michael Paquier <michael.paquier@gmail.com> wrote:

On Thu, Jul 7, 2016 at 6:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

The behaviour of that function is defined in backbranches, so I suggest

we

should not alter that now.

Hm..

What we can do is have another function that makes it clearer that the
answer is variable.
pg_xlogfile_name_offset(offset, timelineid)
always succeeds on standby but needs to be told what timelineid to use

This has clearly value, I have wanted to override the timeline number
a couple of times now. I have always done so with a custom function
and saw many people doing it at application level, but it is a weird
design to have the 1-argument version fail for a node in recovery, and
the 2-argument version succeed. I'd rather have everything consistent
to be honest, with the same function name, and the behavior clearly
documented.

But its been like that for some time; by agreement we don't try to change
the past, just improve the future. I wish it had not been done that way,
but it was. Some things are backpatchable, others not. If you get others to
agree, I'd be fine with it, I'm just trying to explain what I understand to
be the limits of what can be changed.

What I do agree with is the need to fix something in 9.6 here. Releasing a
new feature with an obvious flaw helps nobody, so we must have a function
of some kind that allows you to find the filename for the start and stop
points of backup from a standby.

then we have another function
pg_last_xact_replay_timeline() that allows you to find out the current
timeline

It would be good to have an equivalent pg_current_xlog_timeline as
well that can run on nodes not in recovery. I agree that having a
separate function makes more sense as its equivalents for WAL
positions.

Agreed. I prefer your name for that function.

The actual problem here is somewhat more convoluted, because we would

like

to know the timeline of the basebackup at start and stop. That

information

is not easily available from the backup label data returned by
pg_stop_backup().
We would like to do something like this...

select pg_xlogfile_name_offset(l.lsn, l.tli) from pg_stop_backup(false)

l;

So I suggest we add another column to the output of pg_stop_backup() to
return the tli value directly.

This would be useful as well, that's less custom-parsing logic for
applications based on the segment name in backup_label.

Cool

Note that I am not personally planning to write a patch for all that,
but any people are welcome to pick up this task!

I'm happy to do that.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#8)
Re: pg_xlogfile_name_offset() et al and recovery

On Thu, Jul 7, 2016 at 5:26 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

The behaviour of that function is defined in backbranches, so I suggest we
should not alter that now.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers