Streaming replication and pg_xlogfile_name()
Hi,
In relation to the functions added recently, I found an annoying problem;
pg_xlogfile_name(pg_last_xlog_receive/replay_location()) might report the
wrong name because pg_xlogfile_name() always uses the current timeline,
and a backend doesn't know the actual timeline related to the location
which pg_last_xlog_receive/replay_location() reports. Even if a backend
knows that, pg_xlogfile_name() would be unable to determine which timeline
should be used.
To solve this problem, I'm thinking of adding the following functions:
* pg_current_timeline() reports the current timeline ID.
* pg_last_receive_timeline() reports the timeline ID which is related
to the last WAL receive location.
* pg_last_replay_timeline() reports the timeline ID which is related
to the last WAL replay location.
* pg_xlogfile_name(location text [, timeline bigint ]) reports the WAL
file name using the given timeline. By default, the current timeline
is used.
* pg_xlogfile_name_offset(location text [, timeline bigint]) reports
the WAL file name and offset using the given timeline. By default,
the current timeline is used.
If the second parameter is omitted, pg_xlogfile_name() would behave
as it does now. We can get the right WAL file name by giving it the
result of pg_last_receive/replay_timeline().
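To make the dependence on the timeline concrete, here is a minimal C sketch of how a WAL file name can be derived from a timeline ID plus a location. The type SketchRecPtr, the function sketch_wal_file_name() and the 16 MB segment size are assumptions made for illustration and are not the server's code; only the 24-hex-digit layout (timeline, log id, segment) follows the file names PostgreSQL actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed default WAL segment size (16 MB); real builds may differ. */
#define SKETCH_SEG_SIZE (16u * 1024u * 1024u)

/* Illustrative stand-in for an xlog location: log id plus byte offset. */
typedef struct
{
    uint32_t xlogid;
    uint32_t xrecoff;
} SketchRecPtr;

/*
 * Build a 24-hex-digit WAL file name from a timeline ID and a location.
 * A location exactly on a segment boundary maps to the previous segment,
 * as the comments in the patch below describe.
 * This sketch assumes xrecoff > 0 and a buffer of at least 25 bytes.
 */
static void
sketch_wal_file_name(char *buf, size_t buflen, uint32_t tli, SketchRecPtr ptr)
{
    uint32_t seg;

    assert(buf != NULL && buflen >= 25);
    assert(ptr.xrecoff > 0);

    seg = (ptr.xrecoff - 1) / SKETCH_SEG_SIZE;
    snprintf(buf, buflen, "%08X%08X%08X",
             (unsigned int) tli,
             (unsigned int) ptr.xlogid,
             (unsigned int) seg);
}
```

With the same location 6/200016C, passing timeline 1 and timeline 2 gives names that differ only in the first eight digits, which is why a wrong timeline silently produces a wrong file name.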
Thoughts? Or should we just drop the support of pg_xlogfile_name()
for pg_last_xlog_receive/replay_location()?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao wrote:
In relation to the functions added recently, I found an annoying problem;
pg_xlogfile_name(pg_last_xlog_receive/replay_location()) might report the
wrong name because pg_xlogfile_name() always uses the current timeline,
and a backend doesn't know the actual timeline related to the location
which pg_last_xlog_receive/replay_location() reports. Even if a backend
knows that, pg_xlogfile_name() would be unable to determine which timeline
should be used.
Hmm, I'm not sure what the use case for this is, but I agree it seems
annoying that you can almost reconstruct the exact filename, but not
quite because of the possible change in timeline ID.
To solve this problem, I'm thinking of adding the following functions:
* pg_current_timeline() reports the current timeline ID.
* pg_last_receive_timeline() reports the timeline ID which is related
to the last WAL receive location.
* pg_last_replay_timeline() reports the timeline ID which is related
to the last WAL replay location.
* pg_xlogfile_name(location text [, timeline bigint ]) reports the WAL
file name using the given timeline. By default, the current timeline
is used.
* pg_xlogfile_name_offset(location text [, timeline bigint]) reports
the WAL file name and offset using the given timeline. By default,
the current timeline is used.
That gets quite complicated to use. And there's a little race condition
too: when you call pg_last_replay_timeline() and
pg_last_xlog_replay_location() functions to get the timeline and
XLogRecPtr of the last replayed record, the timeline might change in
between the calls, so you end up with a combination that was never
actually replayed.
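One way to avoid that race is to read the timeline and the location in a single critical section and report them together. The sketch below is only an illustration under assumptions: SketchShared and the sketch_* functions are hypothetical names, and a C11 atomic_flag spinlock stands in for the server's own locking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative shared state; the field names are hypothetical. */
typedef struct
{
    atomic_flag lock;       /* simple spinlock guarding the fields below */
    uint32_t    tli;
    uint32_t    xlogid;
    uint32_t    xrecoff;
} SketchShared;

static void
sketch_lock(SketchShared *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                   /* spin until the flag is released */
}

static void
sketch_unlock(SketchShared *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Writer side: publish timeline and location as one unit. */
static void
sketch_publish(SketchShared *s, uint32_t tli, uint32_t xlogid, uint32_t xrecoff)
{
    sketch_lock(s);
    s->tli = tli;
    s->xlogid = xlogid;
    s->xrecoff = xrecoff;
    sketch_unlock(s);
}

/*
 * Reader side: copy all three values while holding the lock once, then
 * format them after releasing it. Returns snprintf's result.
 */
static int
sketch_report(SketchShared *s, char *buf, size_t buflen)
{
    uint32_t tli, xlogid, xrecoff;

    assert(s != NULL && buf != NULL);

    sketch_lock(s);
    tli = s->tli;
    xlogid = s->xlogid;
    xrecoff = s->xrecoff;
    sketch_unlock(s);

    return snprintf(buf, buflen, "%X/%X/%X",
                    (unsigned int) tli,
                    (unsigned int) xlogid,
                    (unsigned int) xrecoff);
}
```

Because the reader never releases the lock between the two reads, it cannot observe a timeline from one update paired with a location from another.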
How about extending the format of the string returned by
pg_last_xlog_receive/replay_location() to include the timeline ID? When
it currently returns e.g '6/200016C', it could return '1/6/200016C',
where 1 is the timeline ID. Then just teach pg_xlogfile_name[_offset]()
to accept that format as well.
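A parser that accepts both the extended and the existing format could look roughly like the following sketch. SketchLocation and sketch_parse_location() are hypothetical names chosen for illustration; the only library call relied on is standard sscanf().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative result type; not a PostgreSQL structure. */
typedef struct
{
    int      has_tli;       /* 1 if the string carried a timeline ID */
    uint32_t tli;
    uint32_t xlogid;
    uint32_t xrecoff;
} SketchLocation;

/*
 * Parse "tli/xlogid/xrecoff" or the older "xlogid/xrecoff" (all hex).
 * Returns 0 on success, -1 if neither form matches.
 *
 * The three-field form has to be tried first, because "%X/%X" would also
 * match the first two fields of a three-field string.
 */
static int
sketch_parse_location(const char *s, uint32_t default_tli, SketchLocation *out)
{
    unsigned int a, b, c;

    assert(s != NULL && out != NULL);

    if (sscanf(s, "%X/%X/%X", &a, &b, &c) == 3)
    {
        out->has_tli = 1;
        out->tli = a;
        out->xlogid = b;
        out->xrecoff = c;
        return 0;
    }
    if (sscanf(s, "%X/%X", &a, &b) == 2)
    {
        out->has_tli = 0;
        out->tli = default_tli;     /* caller's notion of the current timeline */
        out->xlogid = a;
        out->xrecoff = b;
        return 0;
    }
    return -1;
}
```

Called with '1/6/200016C' the sketch takes the timeline from the string; called with '6/200016C' it falls back to the default timeline supplied by the caller, so existing two-field input keeps working.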
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Thu, Jan 28, 2010 at 5:28 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
How about extending the format of the string returned by
pg_last_xlog_receive/replay_location() to include the timeline ID? When
it currently returns e.g '6/200016C', it could return '1/6/200016C',
where 1 is the timeline ID. Then just teach pg_xlogfile_name[_offset]()
to accept that format as well.
Sounds good. The attached patch does so. Also the code is available
in the 'replication' branch in my git repository.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
extend_format_of_recovery_info_funcs.patch (text/x-patch)
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 13152,13157 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 13152,13161 ----
This is usually the desired behavior for managing transaction log archiving
behavior, since the preceding file is the last one that currently
needs to be archived.
+ These functions also accept as a parameter the string that consists of timeline and
+ location, separated by a slash. In this case a transaction log file name is computed
+ by using the given timeline. On the other hand, if timeline is not supplied, the
+ current timeline is used for the computation.
</para>
<para>
***************
*** 13198,13210 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location received and synced to disk during
! streaming recovery. If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! InvalidXLogRecPtr (0/0).
</entry>
</row>
<row>
--- 13202,13216 ----
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log received and synced
! to disk during streaming recovery. The return string is separated by a slash,
! the first value indicates the timeline and the other the location.
! If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! <literal>0/0/0</>.
</entry>
</row>
<row>
***************
*** 13212,13223 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location replayed during recovery.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be InvalidXLogRecPtr (0/0).
</entry>
</row>
</tbody>
--- 13218,13231 ----
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log replayed during
! recovery. The return string is separated by a slash, the first value
! indicates the timeline and the other the location.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be <literal>0/0/0</>.
</entry>
</row>
</tbody>
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 395,400 **** typedef struct XLogCtlData
--- 395,402 ----
TimestampTz recoveryLastXTime;
/* end+1 of the last record replayed */
XLogRecPtr recoveryLastRecPtr;
+ /* tli of last record replayed */
+ TimeLineID recoveryLastTLI;
slock_t info_lck; /* locks shared variables shown above */
} XLogCtlData;
***************
*** 5864,5873 **** StartupXLOG(void)
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /* initialize shared replayEndRecPtr and recoveryLastRecPtr */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
--- 5866,5882 ----
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /*
! * initialize shared replayEndRecPtr, recoveryLastRecPtr and
! * recoveryLastTLI. Actually, the latter two variables don't need to
! * be initialized here since they are expected to be updated at least
! * once until read only connections will have read them. But just in
! * case.
! */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
+ xlogctl->recoveryLastTLI = curFileTLI;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
***************
*** 5995,6005 **** StartupXLOG(void)
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr after this record has been
! * replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
--- 6004,6015 ----
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr and recoveryLastTLI
! * after this record has been replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
+ xlogctl->recoveryLastTLI = curFileTLI;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
***************
*** 8334,8340 **** pg_current_xlog_insert_location(PG_FUNCTION_ARGS)
}
/*
! * Report the last WAL receive location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
--- 8344,8350 ----
}
/*
! * Report the last WAL receive tli and location
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
***************
*** 8347,8359 **** pg_last_xlog_receive_location(PG_FUNCTION_ARGS)
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X",
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
--- 8357,8370 ----
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X/%X",
! XLogRecPtrIsInvalid(recptr) ? 0 : GetRecoveryTargetTLI(),
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay tli and location
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
***************
*** 8363,8377 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X",
! recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
--- 8374,8390 ----
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
+ TimeLineID tli;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
+ tli = xlogctl->recoveryLastTLI;
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X/%X",
! tli, recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
***************
*** 8379,8384 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
--- 8392,8401 ----
* Compute an xlog file name and decimal byte offset given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
+ *
* Note that a location exactly at a segment boundary is taken to be in
* the previous segment. This is usually the right thing, since the
* expected usage is to determine which xlog file(s) are ready to archive.
***************
*** 8388,8398 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
--- 8405,8417 ----
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
uint32 xrecoff;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
Datum values[2];
***************
*** 8406,8412 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8425,8433 ----
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8431,8437 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
--- 8452,8458 ----
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
***************
*** 8457,8478 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8478,8507 ----
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
+ *
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8482,8488 **** pg_xlogfile_name(PG_FUNCTION_ARGS)
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
--- 8511,8517 ----
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
Fujii Masao wrote:
On Thu, Jan 28, 2010 at 5:28 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
How about extending the format of the string returned by
pg_last_xlog_receive/replay_location() to include the timeline ID? When
it currently returns e.g '6/200016C', it could return '1/6/200016C',
where 1 is the timeline ID. Then just teach pg_xlogfile_name[_offset]()
to accept that format as well.
Sounds good. The attached patch does so. Also the code is available
in the 'replication' branch in my git repository.
--- 5866,5882 ----
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /*
! * initialize shared replayEndRecPtr, recoveryLastRecPtr and
! * recoveryLastTLI. Actually, the latter two variables don't need to
! * be initialized here since they are expected to be updated at least
! * once until read only connections will have read them. But just in
! * case.
! */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
+ xlogctl->recoveryLastTLI = curFileTLI;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
Thinking about this again, I'm not sure this is a good idea. Using
curFileTLI makes sense if you're going to call pg_xlogfile_name() and
would expect it to return the filename of the file containing the WAL
record being replayed. But in other contexts, it seems strange for
pg_last_replay_timeline() to return the TLI of the first record in the
file, rather than the actual record replayed.
I don't have any better ideas, though.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Mon, Feb 22, 2010 at 9:30 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Thinking about this again, I'm not sure this is a good idea. Using
curFileTLI makes sense if you're going to call pg_xlogfile_name() and
would expect it to return the filename of the file containing the WAL
record being replayed. But in other contexts, it seems strange for
pg_last_replay_timeline() to return the TLI of the first record in the
file, rather than the actual record replayed.
Umm... though I might misunderstand your point, curFileTLI is the TLI
appearing in the name of the WAL file. So it's not the TLI of the first
record in the file, is it?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao wrote:
On Mon, Feb 22, 2010 at 9:30 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Thinking about this again, I'm not sure this is a good idea. Using
curFileTLI makes sense if you're going to call pg_xlogfile_name() and
would expect it to return the filename of the file containing the WAL
record being replayed. But in other contexts, it seems strange for
pg_last_replay_timeline() to return the TLI of the first record in the
file, rather than the actual record replayed.
Umm... though I might misunderstand your point, curFileTLI is the TLI
appearing in the name of WAL file.
Yes.
So it's not the TLI of the first record in the file, is it?
Hmm, or is it the TLI of the last record? Not sure. Anyway, if there's a
TLI switch in the current WAL file, curFileTLI doesn't always represent
the TLI of the current record.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Thu, 2010-01-28 at 10:28 +0200, Heikki Linnakangas wrote:
Fujii Masao wrote:
In relation to the functions added recently, I found an annoying problem;
pg_xlogfile_name(pg_last_xlog_receive/replay_location()) might report the
wrong name because pg_xlogfile_name() always uses the current timeline,
and a backend doesn't know the actual timeline related to the location
which pg_last_xlog_receive/replay_location() reports. Even if a backend
knows that, pg_xlogfile_name() would be unable to determine which timeline
should be used.
Hmm, I'm not sure what the use case for this is
Agreed. What is the use case for this?
--
Simon Riggs www.2ndQuadrant.com
On Tue, Feb 23, 2010 at 4:08 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
So it's not the TLI of the first record in the file, is it?
Hmm, or is it the TLI of the last record? Not sure. Anyway, if there's a
TLI switch in the current WAL file, curFileTLI doesn't always represent
the TLI of the current record.
Hmm. How about using lastPageTLI instead of curFileTLI? lastPageTLI
would always represent the TLI of the current record.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Wed, Feb 24, 2010 at 7:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, 2010-01-28 at 10:28 +0200, Heikki Linnakangas wrote:
Fujii Masao wrote:
In relation to the functions added recently, I found an annoying problem;
pg_xlogfile_name(pg_last_xlog_receive/replay_location()) might report the
wrong name because pg_xlogfile_name() always uses the current timeline,
and a backend doesn't know the actual timeline related to the location
which pg_last_xlog_receive/replay_location() reports. Even if a backend
knows that, pg_xlogfile_name() would be unable to determine which timeline
should be used.
Hmm, I'm not sure what the use case for this is
Agreed. What is the use case for this?
Since the current behavior would annoy many users (e.g., [*1]),
I proposed to change it.
[*1]
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02014.php
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, 2010-02-25 at 12:02 +0900, Fujii Masao wrote:
On Wed, Feb 24, 2010 at 7:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, 2010-01-28 at 10:28 +0200, Heikki Linnakangas wrote:
Fujii Masao wrote:
In relation to the functions added recently, I found an annoying problem;
pg_xlogfile_name(pg_last_xlog_receive/replay_location()) might report the
wrong name because pg_xlogfile_name() always uses the current timeline,
and a backend doesn't know the actual timeline related to the location
which pg_last_xlog_receive/replay_location() reports. Even if a backend
knows that, pg_xlogfile_name() would be unable to determine which timeline
should be used.
Hmm, I'm not sure what the use case for this is
Agreed. What is the use case for this?
Since the current behavior would annoy many users (e.g., [*1]),
I proposed to change it.
[*1]
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02014.php
OK, go for it.
If we expose the timeline as part of an "xlog location", then we should
do that everywhere as a change for 9.0. Clearly, "xlog location" has no
meaning without the timeline anyway, so this seems like a necessary
change not just a quick fix. It breaks compatibility, but since we're
changing replication in 9.0 that shouldn't be a problem.
--
Simon Riggs www.2ndQuadrant.com
On Thu, Feb 25, 2010 at 6:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
If we expose the timeline as part of an "xlog location", then we should
do that everywhere as a change for 9.0.
Everywhere? You mean changing the format of the return value of all
the following functions?
- pg_start_backup()
- pg_stop_backup()
- pg_switch_xlog()
- pg_current_xlog_location()
- pg_current_xlog_insert_location()
Clearly, "xlog location" has no
meaning without the timeline anyway, so this seems like a necessary
change not just a quick fix. It breaks compatibility, but since we're
changing replication in 9.0 that shouldn't be a problem.
Umm... ISTM a large number of users would complain about that
change because of compatibility.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, Feb 25, 2010 at 11:57 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Tue, Feb 23, 2010 at 4:08 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
So it's not the TLI of the first record in the file, is it?
Hmm, or is it the TLI of the last record? Not sure. Anyway, if there's a
TLI switch in the current WAL file, curFileTLI doesn't always represent
the TLI of the current record.
Hmm. How about using lastPageTLI instead of curFileTLI? lastPageTLI
would always represent the TLI of the current record.
I attached the revised patch which uses lastPageTLI instead of curFileTLI
as the timeline of the last applied record.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
extend_format_of_recovery_info_funcs_v2.patch (text/x-diff)
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 13199,13204 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 13199,13208 ----
This is usually the desired behavior for managing transaction log archiving
behavior, since the preceding file is the last one that currently
needs to be archived.
+ These functions also accept as a parameter the string that consists of timeline and
+ location, separated by a slash. In this case a transaction log file name is computed
+ by using the given timeline. On the other hand, if timeline is not supplied, the
+ current timeline is used for the computation.
</para>
<para>
***************
*** 13245,13257 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location received and synced to disk during
! streaming recovery. If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! InvalidXLogRecPtr (0/0).
</entry>
</row>
<row>
--- 13249,13263 ----
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log received and synced
! to disk during streaming recovery. The return string is separated by a slash,
! the first value indicates the timeline and the other the location.
! If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! <literal>0/0/0</>.
</entry>
</row>
<row>
***************
*** 13259,13270 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location replayed during recovery.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be InvalidXLogRecPtr (0/0).
</entry>
</row>
</tbody>
--- 13265,13278 ----
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log replayed during
! recovery. The return string is separated by a slash, the first value
! indicates the timeline and the other the location.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be <literal>0/0/0</>.
</entry>
</row>
</tbody>
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 392,397 **** typedef struct XLogCtlData
--- 392,399 ----
TimestampTz recoveryLastXTime;
/* end+1 of the last record replayed */
XLogRecPtr recoveryLastRecPtr;
+ /* tli of last record replayed */
+ TimeLineID recoveryLastTLI;
slock_t info_lck; /* locks shared variables shown above */
} XLogCtlData;
***************
*** 5776,5785 **** StartupXLOG(void)
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /* initialize shared replayEndRecPtr and recoveryLastRecPtr */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
--- 5778,5794 ----
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /*
! * initialize shared replayEndRecPtr, recoveryLastRecPtr and
! * recoveryLastTLI. Actually, the latter two variables don't need to
! * be initialized here since they are expected to be updated at least
! * once until read only connections will have read them. But just in
! * case.
! */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
+ xlogctl->recoveryLastTLI = lastPageTLI;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
***************
*** 5907,5917 **** StartupXLOG(void)
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr after this record has been
! * replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
--- 5916,5927 ----
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr and recoveryLastTLI
! * after this record has been replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
+ xlogctl->recoveryLastTLI = lastPageTLI;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
***************
*** 8269,8275 **** pg_current_xlog_insert_location(PG_FUNCTION_ARGS)
}
/*
! * Report the last WAL receive location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
--- 8279,8285 ----
}
/*
! * Report the last WAL receive tli and location
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
***************
*** 8282,8294 **** pg_last_xlog_receive_location(PG_FUNCTION_ARGS)
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X",
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
--- 8292,8305 ----
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X/%X",
! XLogRecPtrIsInvalid(recptr) ? 0 : GetRecoveryTargetTLI(),
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay tli and location
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
***************
*** 8298,8312 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X",
! recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
--- 8309,8325 ----
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
+ TimeLineID tli;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
+ tli = xlogctl->recoveryLastTLI;
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X/%X",
! tli, recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
***************
*** 8314,8319 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
--- 8327,8336 ----
* Compute an xlog file name and decimal byte offset given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
+ *
* Note that a location exactly at a segment boundary is taken to be in
* the previous segment. This is usually the right thing, since the
* expected usage is to determine which xlog file(s) are ready to archive.
***************
*** 8323,8333 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
--- 8340,8352 ----
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
uint32 xrecoff;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
Datum values[2];
***************
*** 8341,8347 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8360,8368 ----
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8366,8372 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
--- 8387,8393 ----
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
***************
*** 8392,8413 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8413,8442 ----
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
+ *
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8417,8423 **** pg_xlogfile_name(PG_FUNCTION_ARGS)
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
--- 8446,8452 ----
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
On Thu, February 25, 2010 17:34, Fujii Masao wrote:
I attached the revised patch which uses lastPageTLI instead of curFileTLI
as the timeline of the last applied record.
With this patch the standby compiles, tests, installs OK.
I wanted to check with you if the following is expected.
With standby (correctly) as follows :
LOG: redo starts at 0/1000020
LOG: consistent recovery state reached at 0/2000000
LOG: database system is ready to accept read only connections
This is OK.
However, initially (even after the above 'ready' message)
the timeline value as reported by
pg_xlogfile_name_offset(pg_last_xlog_replay_location())
is zero.
After 5 minutes or so (without any activity on primary
or standby), it proceeds to 1 (see below):
(standby)
2010.02.25 21:58:21 $ psql
psql (9.0devel)
Type "help" for help.
replicas=# \x
Expanded display is on.
replicas=# select
pg_last_xlog_replay_location()
, pg_xlogfile_name_offset(pg_last_xlog_replay_location())
, pg_last_xlog_receive_location()
, pg_xlogfile_name_offset(pg_last_xlog_receive_location())
, now();
-[ RECORD 1 ]-----------------+------------------------------------
pg_last_xlog_replay_location | 0/0/2000000
pg_xlogfile_name_offset | (000000000000000000000001,16777216)
pg_last_xlog_receive_location | 1/0/2000000
pg_xlogfile_name_offset | (000000010000000000000001,16777216)
now | 2010-02-25 22:03:41.585808+01
replicas=# select
pg_last_xlog_replay_location()
, pg_xlogfile_name_offset(pg_last_xlog_replay_location())
, pg_last_xlog_receive_location()
, pg_xlogfile_name_offset(pg_last_xlog_receive_location())
, now();
-[ RECORD 1 ]-----------------+------------------------------------
pg_last_xlog_replay_location | 0/0/2000000
pg_xlogfile_name_offset | (000000000000000000000001,16777216)
pg_last_xlog_receive_location | 1/0/2000000
pg_xlogfile_name_offset | (000000010000000000000001,16777216)
now | 2010-02-25 22:06:56.008181+01
replicas=# select
pg_last_xlog_replay_location()
, pg_xlogfile_name_offset(pg_last_xlog_replay_location())
, pg_last_xlog_receive_location()
, pg_xlogfile_name_offset(pg_last_xlog_receive_location())
, now();
-[ RECORD 1 ]-----------------+-------------------------------
pg_last_xlog_replay_location | 1/0/20000B8
pg_xlogfile_name_offset | (000000010000000000000002,184)
pg_last_xlog_receive_location | 1/0/20000B8
pg_xlogfile_name_offset | (000000010000000000000002,184)
now | 2010-02-25 22:07:51.368363+01
I'm not sure this qualifies as a bug, but if not, it should probably be mentioned somewhere in the
documentation.
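For reference, the file names and offsets in the output above can be reproduced outside the server.
The following is a minimal standalone sketch, not the patched xlog.c code: it assumes the default
16 MB WAL segment size, and the function name wal_file_name and its signature are hypothetical.
It mirrors the two-step sscanf parse from the patch (timeline/xlogid/xrecoff first, then
xlogid/xrecoff with a caller-supplied default timeline) and the rule that a location exactly on a
segment boundary belongs to the previous segment.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default WAL segment size of 16 MB. */
#define WAL_SEG_SIZE (16u * 1024u * 1024u)

/*
 * Hypothetical helper: turn "tli/xlogid/xrecoff" or "xlogid/xrecoff" into a
 * WAL file name and a byte offset within that file.  default_tli is used
 * only when the string carries no timeline.  Returns 0 on success, -1 if
 * the string cannot be parsed.  fname needs room for 24 hex digits plus
 * the terminating NUL.
 */
static int
wal_file_name(const char *loc, uint32_t default_tli,
              char *fname, size_t fnamelen, uint32_t *offset)
{
    unsigned int tli;
    unsigned int xlogid;
    unsigned int xrecoff;
    uint32_t     seg;

    /* Try the extended three-part format first, as the patch does. */
    if (sscanf(loc, "%X/%X/%X", &tli, &xlogid, &xrecoff) != 3)
    {
        tli = default_tli;
        if (sscanf(loc, "%X/%X", &xlogid, &xrecoff) != 2)
            return -1;
    }

    /*
     * Previous-segment rule: a location exactly at a segment boundary is
     * attributed to the segment before it.  Like the unsigned arithmetic
     * it imitates, this wraps around for xrecoff == 0; the sketch does
     * not special-case that.
     */
    seg = (uint32_t) (xrecoff - 1) / WAL_SEG_SIZE;

    snprintf(fname, fnamelen, "%08X%08X%08X", tli, xlogid, (unsigned int) seg);
    *offset = (uint32_t) xrecoff - seg * WAL_SEG_SIZE;
    return 0;
}
```

With this sketch, "0/0/2000000" maps to 000000000000000000000001 at offset 16777216 and
"1/0/20000B8" maps to 000000010000000000000002 at offset 184, matching the psql output above.
The zero in the first file name is simply the timeline field of the input string carried through.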
(Oh, and to answer Heikki's earlier question, "what are you trying to achieve?": I am trying to keep
track of how far behind the standby is when I restore a large dump (500 GB or so) into the primary;
eventually I also want to run pgbench on both at the same time.)
thanks,
Erik Rijkers
Sorry for the delay.
On Fri, Feb 26, 2010 at 6:26 AM, Erik Rijkers <er@xs4all.nl> wrote:
With this patch the standby compiles, tests, installs OK.
I wanted to check with you if the following is expected.
Thanks for the test and bug report!
With standby (correctly) as follows :
LOG: redo starts at 0/1000020
LOG: consistent recovery state reached at 0/2000000
LOG: database system is ready to accept read only connections
This is OK.
However, initially (even after the above 'ready' message)
the timeline value as reported by
pg_xlogfile_name_offset(pg_last_xlog_replay_location())
is zero.
When we read a WAL record discontinuously (e.g., the REDO starting
record or the last applied record), lastPageTLI is always reset to
zero. If that record is not in the buffer, its page is read from
disk and lastPageTLI is set to the right timeline. Otherwise,
lastPageTLI wrongly remains at zero. This is the cause of the
problem that you reported.
I revised the patch so that the lastPageTLI is always set correctly.
Please try this new patch.
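
To make the failure mode above concrete, here is a small self-contained model of it. This is only
a sketch under simplifying assumptions: the Reader struct and the three functions are hypothetical
and stand in for the real ReadRecord/XLogPageRead logic, which is far more involved. The only
behaviour modelled is when lastPageTLI gets reset relative to the "page already buffered" shortcut.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical, heavily simplified reader state; names only echo the patch. */
typedef struct
{
    bool        page_buffered;    /* is the wanted page already in the buffer? */
    TimeLineID  page_header_tli;  /* timeline stored in that page's header */
    TimeLineID  lastPageTLI;      /* timeline the reader currently believes in */
} Reader;

/* Reading a page from disk validates its header and records its timeline. */
static void
read_page(Reader *r)
{
    r->lastPageTLI = r->page_header_tli;
    r->page_buffered = true;
}

/*
 * Earlier behaviour: every random access forgets the timeline first.  If
 * the page is already buffered, nothing sets it again and zero leaks out.
 */
static TimeLineID
read_record_reset_early(Reader *r)
{
    r->lastPageTLI = 0;
    if (!r->page_buffered)
        read_page(r);
    return r->lastPageTLI;
}

/*
 * Revised behaviour: the reset happens only on the path that goes on to
 * read the page, so a buffered page keeps its known timeline.
 */
static TimeLineID
read_record_reset_late(Reader *r)
{
    if (!r->page_buffered)
    {
        r->lastPageTLI = 0;
        read_page(r);
    }
    return r->lastPageTLI;
}
```

In this model a random access to an already-buffered page on timeline 1 yields 0 with the early
reset and 1 with the late reset, while an unbuffered page yields 1 either way.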
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
extend_format_of_recovery_info_funcs_v3.patch (text/x-patch; charset=US-ASCII)
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 13199,13204 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 13199,13208 ----
This is usually the desired behavior for managing transaction log archiving
behavior, since the preceding file is the last one that currently
needs to be archived.
+ These functions also accept as a parameter the string that consists of timeline and
+ location, separated by a slash. In this case a transaction log file name is computed
+ by using the given timeline. On the other hand, if timeline is not supplied, the
+ current timeline is used for the computation.
</para>
<para>
***************
*** 13245,13257 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location received and synced to disk during
! streaming recovery. If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! InvalidXLogRecPtr (0/0).
</entry>
</row>
<row>
--- 13249,13263 ----
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log received and synced
! to disk during streaming recovery. The return string is separated by a slash,
! the first value indicates the timeline and the other the location.
! If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! <literal>0/0/0</>.
</entry>
</row>
<row>
***************
*** 13259,13270 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location replayed during recovery.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be InvalidXLogRecPtr (0/0).
</entry>
</row>
</tbody>
--- 13265,13278 ----
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log replayed during
! recovery. The return string is separated by a slash, the first value
! indicates the timeline and the other the location.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be <literal>0/0/0</>.
</entry>
</row>
</tbody>
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 392,397 **** typedef struct XLogCtlData
--- 392,399 ----
TimestampTz recoveryLastXTime;
/* end+1 of the last record replayed */
XLogRecPtr recoveryLastRecPtr;
+ /* tli of last record replayed */
+ TimeLineID recoveryLastTLI;
slock_t info_lck; /* locks shared variables shown above */
} XLogCtlData;
***************
*** 3581,3594 **** ReadRecord(XLogRecPtr *RecPtr, int emode_arg, bool fetching_ckpt)
RecPtr->xlogid, RecPtr->xrecoff)));
/*
! * Since we are going to a random position in WAL, forget any prior
! * state about what timeline we were in, and allow it to be any
! * timeline in expectedTLIs. We also set a flag to allow curFileTLI
! * to go backwards (but we can't reset that variable right here, since
! * we might not change files at all).
*/
! lastPageTLI = 0; /* see comment in ValidXLOGHeader */
! randAccess = true; /* allow curFileTLI to go backwards too */
}
/* Read the page containing the record */
--- 3583,3593 ----
RecPtr->xlogid, RecPtr->xrecoff)));
/*
! * Since we are going to a random position in WAL, set a flag
! * to allow curFileTLI to go backwards (but we can't reset that
! * variable right here, since we might not change files at all).
*/
! randAccess = true; /* allow curFileTLI to go backwards */
}
/* Read the page containing the record */
***************
*** 5782,5791 **** StartupXLOG(void)
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /* initialize shared replayEndRecPtr and recoveryLastRecPtr */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
--- 5781,5797 ----
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /*
! * initialize shared replayEndRecPtr, recoveryLastRecPtr and
! * recoveryLastTLI. Actually, the latter two variables don't need to
! * be initialized here since they are expected to be updated at least
! * once until read only connections will have read them. But just in
! * case.
! */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
+ xlogctl->recoveryLastTLI = lastPageTLI;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
***************
*** 5913,5923 **** StartupXLOG(void)
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr after this record has been
! * replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
--- 5919,5930 ----
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr and recoveryLastTLI
! * after this record has been replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
+ xlogctl->recoveryLastTLI = lastPageTLI;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
***************
*** 8274,8280 **** pg_current_xlog_insert_location(PG_FUNCTION_ARGS)
}
/*
! * Report the last WAL receive location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
--- 8281,8287 ----
}
/*
! * Report the last WAL receive tli and location
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
***************
*** 8287,8299 **** pg_last_xlog_receive_location(PG_FUNCTION_ARGS)
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X",
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
--- 8294,8307 ----
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X/%X",
! XLogRecPtrIsInvalid(recptr) ? 0 : GetRecoveryTargetTLI(),
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay tli and location
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
***************
*** 8303,8317 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X",
! recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
--- 8311,8327 ----
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
+ TimeLineID tli;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
+ tli = xlogctl->recoveryLastTLI;
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X/%X",
! tli, recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
***************
*** 8319,8324 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
--- 8329,8338 ----
* Compute an xlog file name and decimal byte offset given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
+ *
* Note that a location exactly at a segment boundary is taken to be in
* the previous segment. This is usually the right thing, since the
* expected usage is to determine which xlog file(s) are ready to archive.
***************
*** 8328,8338 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
--- 8342,8354 ----
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
uint32 xrecoff;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
Datum values[2];
***************
*** 8346,8352 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8362,8370 ----
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8371,8377 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
--- 8389,8395 ----
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
***************
*** 8397,8418 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8415,8444 ----
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
+ *
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8422,8428 **** pg_xlogfile_name(PG_FUNCTION_ARGS)
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
--- 8448,8454 ----
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
***************
*** 8721,8726 **** XLogPageRead(XLogRecPtr *RecPtr, int emode, bool fetching_ckpt,
--- 8747,8760 ----
return true;
/*
+ * Since we are going to a random position in WAL, forget any prior
+ * state about what timeline we were in, and allow it to be any
+ * timeline in expectedTLIs.
+ */
+ if (randAccess)
+ lastPageTLI = 0; /* see comment in ValidXLOGHeader */
+
+ /*
* See if we need to switch to a new segment because the requested record
* is not in the currently open one.
*/
Fujii Masao wrote:
On Fri, Feb 26, 2010 at 6:26 AM, Erik Rijkers <er@xs4all.nl> wrote:
With this patch the standby compiles, tests, installs OK.
I wanted to check with you if the following is expected.
Thanks for the test and bug report!
With standby (correctly) as follows :
LOG: redo starts at 0/1000020
LOG: consistent recovery state reached at 0/2000000
LOG: database system is ready to accept read only connections
This is OK.
However, initially (even after the above 'ready' message)
the timeline value as reported by
pg_xlogfile_name_offset(pg_last_xlog_replay_location())
is zero.
When we read a WAL record discontinuously (e.g., the REDO starting
record or the last applied record), lastPageTLI is always reset to
zero. If that record is not in the buffer, its page is read from
disk and lastPageTLI is set to the right timeline. Otherwise,
lastPageTLI wrongly remains at zero. This is the cause of the
problem that you reported.
I revised the patch so that the lastPageTLI is always set correctly.
Please try this new patch.
This still suffers from ambiguity around a shutdown checkpoint that
changes the TLI. On the page the shutdown checkpoint is on, what is the
TLI in the page header? The TLI before the checkpoint record, I presume.
Now consider a record on the same page after the checkpoint record. It's
on the new timeline, but pg_last_xlog_replay_location() will return the
old TLI, because that's on the page header.
It's not clear what it should return, a TLI corresponding to the filename
of the WAL segment the record was replayed from, so that you can use
pg_xlogfile_name() to find out the filename of the WAL segment being
replayed, or the accurate TLI of the record being replayed. I'm leaning
towards the latter, it feels more correct and accurate, but you could
argue for the former too. In any case, it needs to be well-defined.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tue, Mar 2, 2010 at 8:54 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
This still suffers from ambiguity around a shutdown checkpoint that
changes the TLI. On the page the shutdown checkpoint is on, what is the
TLI in the page header? The TLI before the checkpoint record, I presume.
Now consider a record on the same page after the checkpoint record. It's
on the new timeline, but pg_last_xlog_replay_location() will return the
old TLI, because that's on the page header.
Oh, I see. You are right.
It's not clear what it should return, a TLI corresponding to the filename
of the WAL segment the record was replayed from, so that you can use
pg_xlogfile_name() to find out the filename of the WAL segment being
replayed, or the accurate TLI of the record being replayed. I'm leaning
towards the latter, it feels more correct and accurate, but you could
argue for the former too. In any case, it needs to be well-defined.
I agree with you that the latter is more correct and accurate. The simple
fix is to update lastPageTLI with CheckPoint->ThisTimeLineID when
replaying the shutdown checkpoint record, though we might need a new
variable to keep the last applied timeline instead of lastPageTLI.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Tue, Mar 2, 2010 at 10:52 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
It's not clear what it should return, a TLI corresponding to the filename
of the WAL segment the record was replayed from, so that you can use
pg_xlogfile_name() to find out the filename of the WAL segment being
replayed, or the accurate TLI of the record being replayed. I'm leaning
towards the latter, it feels more correct and accurate, but you could
argue for the former too. In any case, it needs to be well-defined.
I agree with you that the latter is more correct and accurate. The simple
fix is to update lastPageTLI with CheckPoint->ThisTimeLineID when
replaying the shutdown checkpoint record, though we might need a new
variable to keep the last applied timeline instead of lastPageTLI.
Here is the revised patch. I used a new variable (lastRecTLI) instead of
lastPageTLI to track the TLI of the last applied record. It is updated with
the TLI of the log page header when a page is read, and with the TLI of the
checkpoint record when replaying a shutdown checkpoint record that changes
the TLI. So pg_last_xlog_replay_location() can return the accurate TLI of
the last applied record.
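
The difference between the two definitions discussed above can be shown with a toy replay loop.
This is a sketch under stated assumptions, not the patch itself: the ReplayedRecord struct, its
fields and both functions are hypothetical, and replay is reduced to "take the page header's TLI
when a new page is read, then override it when a shutdown checkpoint record is replayed".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical, simplified description of one replayed WAL record. */
typedef struct
{
    int         page_no;                 /* page the record lives on */
    TimeLineID  page_header_tli;         /* TLI in that page's header */
    bool        is_shutdown_checkpoint;  /* shutdown checkpoint record? */
    TimeLineID  checkpoint_tli;          /* TLI carried by the checkpoint */
} ReplayedRecord;

/* Page-header definition: report the header TLI of the last record's page. */
static TimeLineID
last_tli_from_page_header(const ReplayedRecord *recs, size_t n)
{
    return (n > 0) ? recs[n - 1].page_header_tli : 0;
}

/*
 * Record-accurate definition: start from the page header whenever a new
 * page is read, but let a shutdown checkpoint record override it for the
 * records that follow on the same page.
 */
static TimeLineID
last_tli_from_records(const ReplayedRecord *recs, size_t n)
{
    TimeLineID  last_rec_tli = 0;
    int         current_page = -1;
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if (recs[i].page_no != current_page)
        {
            current_page = recs[i].page_no;
            last_rec_tli = recs[i].page_header_tli;
        }
        if (recs[i].is_shutdown_checkpoint)
            last_rec_tli = recs[i].checkpoint_tli;
    }
    return last_rec_tli;
}
```

For three records on one page whose header says timeline 1, where the second record is a shutdown
checkpoint switching to timeline 2, the page-header definition reports 1 and the record-accurate
definition reports 2.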
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
extend_format_of_recovery_info_funcs_v4.patch (text/x-patch; charset=US-ASCII)
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 13199,13204 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 13199,13208 ----
This is usually the desired behavior for managing transaction log archiving
behavior, since the preceding file is the last one that currently
needs to be archived.
+ These functions also accept as a parameter the string that consists of timeline and
+ location, separated by a slash. In this case a transaction log file name is computed
+ by using the given timeline. On the other hand, if timeline is not supplied, the
+ current timeline is used for the computation.
</para>
<para>
***************
*** 13245,13257 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location received and synced to disk during
! streaming recovery. If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! InvalidXLogRecPtr (0/0).
</entry>
</row>
<row>
--- 13249,13263 ----
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log received and synced
! to disk during streaming recovery. The return string is separated by a slash,
! the first value indicates the timeline and the other the location.
! If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
! <literal>0/0/0</>.
</entry>
</row>
<row>
***************
*** 13259,13270 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get last transaction log location replayed during recovery.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be InvalidXLogRecPtr (0/0).
</entry>
</row>
</tbody>
--- 13265,13278 ----
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
! <entry>Get timeline and location of last transaction log replayed during
! recovery. The return string is separated by a slash, the first value
! indicates the timeline and the other the location.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
! then the return value will be <literal>0/0/0</>.
</entry>
</row>
</tbody>
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 392,397 **** typedef struct XLogCtlData
--- 392,399 ----
TimestampTz recoveryLastXTime;
/* end+1 of the last record replayed */
XLogRecPtr recoveryLastRecPtr;
+ /* tli of last record replayed */
+ TimeLineID recoveryLastTLI;
slock_t info_lck; /* locks shared variables shown above */
} XLogCtlData;
***************
*** 471,476 **** static uint32 readRecordBufSize = 0;
--- 473,479 ----
static XLogRecPtr ReadRecPtr; /* start of last record read */
static XLogRecPtr EndRecPtr; /* end+1 of last record read */
static TimeLineID lastPageTLI = 0;
+ static TimeLineID lastRecTLI = 0; /* tli of last record read */
static XLogRecPtr minRecoveryPoint; /* local copy of
* ControlFile->minRecoveryPoint */
***************
*** 3943,3949 **** ValidXLOGHeader(XLogPageHeader hdr, int emode)
readId, readSeg, readOff)));
return false;
}
! lastPageTLI = hdr->xlp_tli;
return true;
}
--- 3946,3952 ----
readId, readSeg, readOff)));
return false;
}
! lastRecTLI = lastPageTLI = hdr->xlp_tli;
return true;
}
***************
*** 5782,5791 **** StartupXLOG(void)
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /* initialize shared replayEndRecPtr and recoveryLastRecPtr */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
--- 5785,5801 ----
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
! /*
! * initialize shared replayEndRecPtr, recoveryLastRecPtr and
! * recoveryLastTLI. Actually, the latter two variables don't need to
! * be initialized here since they are expected to be updated at least
! * once until read only connections will have read them. But just in
! * case.
! */
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->replayEndRecPtr = ReadRecPtr;
xlogctl->recoveryLastRecPtr = ReadRecPtr;
+ xlogctl->recoveryLastTLI = lastRecTLI;
SpinLockRelease(&xlogctl->info_lck);
InRedo = true;
***************
*** 5913,5923 **** StartupXLOG(void)
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr after this record has been
! * replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
--- 5923,5934 ----
error_context_stack = errcontext.previous;
/*
! * Update shared recoveryLastRecPtr and recoveryLastTLI
! * after this record has been replayed.
*/
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->recoveryLastRecPtr = EndRecPtr;
+ xlogctl->recoveryLastTLI = lastRecTLI;
SpinLockRelease(&xlogctl->info_lck);
LastRec = ReadRecPtr;
***************
*** 7479,7484 **** xlog_redo(XLogRecPtr lsn, XLogRecord *record)
--- 7490,7496 ----
/* Following WAL records should be run with new TLI */
ThisTimeLineID = checkPoint.ThisTimeLineID;
}
+ lastRecTLI = ThisTimeLineID;
RecoveryRestartPoint(&checkPoint);
}
***************
*** 8274,8280 **** pg_current_xlog_insert_location(PG_FUNCTION_ARGS)
}
/*
! * Report the last WAL receive location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
--- 8286,8292 ----
}
/*
! * Report the last WAL receive tli and location
*
* This is useful for determining how much of WAL is guaranteed to be received
* and synced to disk by walreceiver.
***************
*** 8287,8299 **** pg_last_xlog_receive_location(PG_FUNCTION_ARGS)
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X",
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay location (same format as pg_start_backup etc)
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
--- 8299,8312 ----
recptr = GetWalRcvWriteRecPtr();
! snprintf(location, sizeof(location), "%X/%X/%X",
! XLogRecPtrIsInvalid(recptr) ? 0 : GetRecoveryTargetTLI(),
recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
/*
! * Report the last WAL replay tli and location
*
* This is useful for determining how much of WAL is visible to read-only
* connections during recovery.
***************
*** 8303,8317 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X",
! recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
--- 8316,8332 ----
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
+ TimeLineID tli;
XLogRecPtr recptr;
char location[MAXFNAMELEN];
SpinLockAcquire(&xlogctl->info_lck);
+ tli = xlogctl->recoveryLastTLI;
recptr = xlogctl->recoveryLastRecPtr;
SpinLockRelease(&xlogctl->info_lck);
! snprintf(location, sizeof(location), "%X/%X/%X",
! tli, recptr.xlogid, recptr.xrecoff);
PG_RETURN_TEXT_P(cstring_to_text(location));
}
***************
*** 8319,8324 **** pg_last_xlog_replay_location(PG_FUNCTION_ARGS)
--- 8334,8343 ----
* Compute an xlog file name and decimal byte offset given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
+ *
* Note that a location exactly at a segment boundary is taken to be in
* the previous segment. This is usually the right thing, since the
* expected usage is to determine which xlog file(s) are ready to archive.
***************
*** 8328,8338 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
--- 8347,8359 ----
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
uint32 xrecoff;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
Datum values[2];
***************
*** 8346,8352 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8367,8375 ----
*/
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8371,8377 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
--- 8394,8400 ----
* xlogfilename
*/
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
values[0] = CStringGetTextDatum(xlogfilename);
isnull[0] = false;
***************
*** 8397,8418 **** pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
--- 8420,8449 ----
/*
* Compute an xlog file name given a WAL location,
* such as is returned by pg_stop_backup() or pg_xlog_switch().
+ *
+ * Also use the tli for the computation if it's given with a location,
+ * such as is returned by pg_last_xlog_receive_location() or
+ * pg_last_xlog_replay_location().
*/
Datum
pg_xlogfile_name(PG_FUNCTION_ARGS)
{
text *location = PG_GETARG_TEXT_P(0);
char *locationstr;
+ unsigned int utli;
unsigned int uxlogid;
unsigned int uxrecoff;
uint32 xlogid;
uint32 xlogseg;
+ TimeLineID tli = ThisTimeLineID;
XLogRecPtr locationpoint;
char xlogfilename[MAXFNAMELEN];
locationstr = text_to_cstring(location);
! if (sscanf(locationstr, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
! tli = (TimeLineID) utli;
! else if (sscanf(locationstr, "%X/%X", &uxlogid, &uxrecoff) != 2)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("could not parse transaction log location \"%s\"",
***************
*** 8422,8428 **** pg_xlogfile_name(PG_FUNCTION_ARGS)
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
--- 8453,8459 ----
locationpoint.xrecoff = uxrecoff;
XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
! XLogFileName(xlogfilename, tli, xlogid, xlogseg);
PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
}
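The parsing change in the patch above can be tried outside the backend. What follows is only a minimal standalone sketch, not the patch's code: the helper name wal_file_name() and the SEG_SIZE constant are hypothetical, and it assumes the default 16 MB segment size and a 32-bit xrecoff. It accepts either "tli/xlogid/xrecoff" or the older "xlogid/xrecoff" form, falls back to a caller-supplied timeline in the second case, and treats a location exactly at a segment boundary as belonging to the previous segment.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: the default 16 MB WAL segment size. */
#define SEG_SIZE ((uint32_t) 16 * 1024 * 1024)

/*
 * Hypothetical helper: build a WAL file name from a location string.
 * Accepts "tli/xlogid/xrecoff" or "xlogid/xrecoff" (all hexadecimal).
 * default_tli is used only when the string carries no timeline.
 * Returns 0 on success, -1 if the string cannot be parsed.
 */
int
wal_file_name(const char *loc, uint32_t default_tli, char *buf, size_t buflen)
{
    unsigned int utli, uxlogid, uxrecoff;
    uint32_t tli = default_tli;
    uint32_t seg;

    /* Try the three-part form first, then fall back to the two-part form. */
    if (sscanf(loc, "%X/%X/%X", &utli, &uxlogid, &uxrecoff) == 3)
        tli = (uint32_t) utli;
    else if (sscanf(loc, "%X/%X", &uxlogid, &uxrecoff) != 2)
        return -1;

    /* A location exactly at a segment boundary maps to the previous segment. */
    seg = ((uint32_t) uxrecoff - 1) / SEG_SIZE;

    /* Same layout as a WAL file name: timeline, log id, segment, 8 hex digits each. */
    snprintf(buf, buflen, "%08X%08X%08X",
             (unsigned int) tli, (unsigned int) uxlogid, (unsigned int) seg);
    return 0;
}
```

Under these assumptions, "2/0/2000000" names a file on timeline 2 regardless of the default passed in, while "0/2000000" uses the caller's default timeline, which corresponds to the existing behaviour of using ThisTimeLineID.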
On Wed, March 3, 2010 15:03, Fujii Masao wrote:
On Tue, Mar 2, 2010 at 10:52 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Here is the revised patch. I used a new local variable instead of lastPageTLI
to track the tli of the last applied record. It is updated with the tli of the
log page header when reading the page, and with the tli of the checkpoint
record when replaying a shutdown checkpoint record that changes the tli.
So pg_last_xlog_replay_location() can return the accurate tli of the last
applied record.
extend_format_of_recovery_info_funcs_v4.patch
Looks good: on the standby, the initial xlog file name immediately after startup is now
000000010000000000000001, as expected.
I'll do my further testing of HS/SR with this patch included.
thanks,
Erik Rijkers