pg_stop_backup wait bug fix

Started by Simon Riggsover 17 years ago8 messages
#1Simon Riggs
simon@2ndQuadrant.com
1 attachment(s)

Minor bug fix for pg_stop_backup() to prevent it waiting longer than
necessary in certain circumstances.

Was originally part of recovery_infrastructure patch.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Attachments:

pg_stop_backup_wait.v1.patchtext/x-patch; charset=UTF-8; name=pg_stop_backup_wait.v1.patchDownload
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.319
diff -c -r1.319 xlog.c
*** src/backend/access/transam/xlog.c	23 Sep 2008 09:20:35 -0000	1.319
--- src/backend/access/transam/xlog.c	8 Oct 2008 13:17:24 -0000
***************
*** 6639,6651 ****
  	LWLockRelease(WALInsertLock);
  
  	/*
! 	 * Force a switch to a new xlog segment file, so that the backup is valid
  	 * as soon as archiver moves out the current segment file. We'll report
  	 * the end address of the XLOG SWITCH record as the backup stopping point.
  	 */
  	stoppoint = RequestXLogSwitch();
  
  	XLByteToSeg(stoppoint, _logId, _logSeg);
  	XLogFileName(stopxlogfilename, ThisTimeLineID, _logId, _logSeg);
  
  	/* Use the log timezone here, not the session timezone */
--- 7002,7026 ----
  	LWLockRelease(WALInsertLock);
  
  	/*
! 	 * Request switch to a new xlog segment file, so that the backup is valid
  	 * as soon as archiver moves out the current segment file. We'll report
  	 * the end address of the XLOG SWITCH record as the backup stopping point.
  	 */
  	stoppoint = RequestXLogSwitch();
  
  	XLByteToSeg(stoppoint, _logId, _logSeg);
+ 
+ 	/*
+ 	 * If we didn't actually switch xlog files then there is nothing in
+ 	 * this file for us to wait for, so set stopxlogfilename to be the
+ 	 * previous file instead. We still report the same ending location.
+ 	 * If we don't do this pg_stop_backup() will wait until we next switch
+ 	 * log files, which could be archive_timeout, or it could be much
+ 	 * longer if archive_timeout is not enabled.
+ 	 */
+ 	if ((stoppoint.xrecoff % XLogSegSize) == 0)
+ 		PrevLogSeg(_logId, _logSeg);
+ 
  	XLogFileName(stopxlogfilename, ThisTimeLineID, _logId, _logSeg);
  
  	/* Use the log timezone here, not the session timezone */
#2Alvaro Herrera
alvherre@commandprompt.com
In reply to: Simon Riggs (#1)
Re: pg_stop_backup wait bug fix

Simon Riggs wrote:

Minor bug fix for pg_stop_backup() to prevent it waiting longer than
necessary in certain circumstances.

As far as I can tell, this patch needs to be applied to HEAD, 8.3 and
8.2 (when xlog switch was implemented), and in fact it applies cleanly
to them. Please confirm.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#3Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#2)
Re: pg_stop_backup wait bug fix

On Fri, 2008-11-07 at 18:32 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

Minor bug fix for pg_stop_backup() to prevent it waiting longer than
necessary in certain circumstances.

As far as I can tell, this patch needs to be applied to HEAD, 8.3 and
8.2 (when xlog switch was implemented), and in fact it applies cleanly
to them. Please confirm.

There are other patches that we should consider along with this one.

This particular patch only has meaning when applied with the changes to
pg_stop_backup() earlier in 8.4dev cycle. It isn't a bug in xlog switch,
it is a bug in how the new waiting code interprets the return value from
xlog switch.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#4Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#1)
Re: pg_stop_backup wait bug fix

Hi,

On Wed, Oct 8, 2008 at 10:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Minor bug fix for pg_stop_backup() to prevent it waiting longer than
necessary in certain circumstances.

Why don't you use XLByteToPrevSeg like pg_xlogfile_name?
I think that we should uniform the logic as much as possible.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#5Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#4)
1 attachment(s)
Re: pg_stop_backup wait bug fix

Fujii Masao wrote:

On Wed, Oct 8, 2008 at 10:23 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Minor bug fix for pg_stop_backup() to prevent it waiting longer than
necessary in certain circumstances.

Why don't you use XLByteToPrevSeg like pg_xlogfile_name?
I think that we should uniform the logic as much as possible.

Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace
the current XLByteToSeg call with XLByteToPrevSeg? That would offset the
return value of the function by one byte as well, as well as the value
printed to the backup history file. In fact, I think the original patch
got that wrong; it would return the location of the *beginning* of the
last xlog file.

I also noticed that the 2nd BackupHistoryFileName call in that function
is useless; histfilepath variable is already filled in earlier.

How does the attached patch look to you? Do you have an easy way to test
this?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachments:

pg_stop_backup_wait-heikki-1.patchtext/x-diff; name=pg_stop_backup_wait-heikki-1.patchDownload
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.322
diff -c -r1.322 xlog.c
*** src/backend/access/transam/xlog.c	9 Nov 2008 17:51:15 -0000	1.322
--- src/backend/access/transam/xlog.c	2 Dec 2008 20:10:03 -0000
***************
*** 6674,6679 ****
--- 6674,6680 ----
  	char		histfilepath[MAXPGPATH];
  	char		startxlogfilename[MAXFNAMELEN];
  	char		stopxlogfilename[MAXFNAMELEN];
+ 	char		lastxlogfilename[MAXFNAMELEN];
  	uint32		_logId;
  	uint32		_logSeg;
  	FILE	   *lfp;
***************
*** 6711,6716 ****
--- 6712,6720 ----
  	XLByteToSeg(stoppoint, _logId, _logSeg);
  	XLogFileName(stopxlogfilename, ThisTimeLineID, _logId, _logSeg);
  
+ 	XLByteToPrevSeg(stoppoint, _logId, _logSeg);
+ 	XLogFileName(lastxlogfilename, ThisTimeLineID, _logId, _logSeg);
+ 
  	/* Use the log timezone here, not the session timezone */
  	stamp_time = (pg_time_t) time(NULL);
  	pg_strftime(strfbuf, sizeof(strfbuf),
***************
*** 6801,6813 ****
  	 * we assume the admin wanted his backup to work completely. If you
  	 * don't wish to wait, you can set statement_timeout.
  	 */
- 	BackupHistoryFileName(histfilepath, ThisTimeLineID, _logId, _logSeg,
- 						  startpoint.xrecoff % XLogSegSize);
- 
  	seconds_before_warning = 60;
  	waits = 0;
  
! 	while (XLogArchiveIsBusy(stopxlogfilename) ||
  		   XLogArchiveIsBusy(histfilepath))
  	{
  		CHECK_FOR_INTERRUPTS();
--- 6805,6814 ----
  	 * we assume the admin wanted his backup to work completely. If you
  	 * don't wish to wait, you can set statement_timeout.
  	 */
  	seconds_before_warning = 60;
  	waits = 0;
  
! 	while (XLogArchiveIsBusy(lastxlogfilename) ||
  		   XLogArchiveIsBusy(histfilepath))
  	{
  		CHECK_FOR_INTERRUPTS();
#6Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#5)
Re: pg_stop_backup wait bug fix

On Wed, Dec 3, 2008 at 5:13 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Fujii Masao wrote:

On Wed, Oct 8, 2008 at 10:23 PM, Simon Riggs <simon@2ndquadrant.com>
wrote:

Minor bug fix for pg_stop_backup() to prevent it waiting longer than
necessary in certain circumstances.

Why don't you use XLByteToPrevSeg like pg_xlogfile_name?
I think that we should uniform the logic as much as possible.

Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace the
current XLByteToSeg call with XLByteToPrevSeg? That would offset the return
value of the function by one byte as well, as well as the value printed to
the backup history file. In fact, I think the original patch got that wrong;
it would return the location of the *beginning* of the last xlog file.

You're right. As you say, the value (stopxlogfilename) printed to the backup
history file is wrong. But, since the value is not used fortunately,
any troubles
have not come up. So, I think that we can just replace them.

I also noticed that the 2nd BackupHistoryFileName call in that function is
useless; histfilepath variable is already filled in earlier.

Somewhat confusingly, BackupHistoryFileName is called only once. Isn't 1st
(which probably you thought) BackupHistoryFilePath? In order to prevent
confusion, we should add new local variable (histfilename) for the backup
history file name?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#7Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#6)
Re: pg_stop_backup wait bug fix

Fujii Masao wrote:

On Wed, Dec 3, 2008 at 5:13 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace the
current XLByteToSeg call with XLByteToPrevSeg? That would offset the return
value of the function by one byte as well, as well as the value printed to
the backup history file. In fact, I think the original patch got that wrong;
it would return the location of the *beginning* of the last xlog file.

You're right. As you say, the value (stopxlogfilename) printed to the backup
history file is wrong. But, since the value is not used fortunately,
any troubles
have not come up. So, I think that we can just replace them.

Changing the return value doesn't seem like a good idea. If nothing
else, it would be complicated to explain what it returns. I committed a
patch that changes the waiting behavior, but not the return value or
what's written into the backup label file,

I also noticed that the 2nd BackupHistoryFileName call in that function is
useless; histfilepath variable is already filled in earlier.

Somewhat confusingly, BackupHistoryFileName is called only once. Isn't 1st
(which probably you thought) BackupHistoryFilePath?

Ouch, you're right. That's subtle.

In order to prevent
confusion, we should add new local variable (histfilename) for the backup
history file name?

Agreed. I included that in the patch.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#8Ibrar Ahmed
ibrar.ahmad@gmail.com
In reply to: Heikki Linnakangas (#7)
Re: pg_stop_backup wait bug fix

Hi,

I have looked at the patch and it looks OK to me. BTW I am not too
much familiar with this area of code, so I am not at the position to
argue that patch -:) . I haven't found an easy way to test the patch.

On Wed, Dec 3, 2008 at 1:24 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Fujii Masao wrote:

On Wed, Dec 3, 2008 at 5:13 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

Agreed, should use XLByteToPrevSeg. But I wonder if we can just replace
the
current XLByteToSeg call with XLByteToPrevSeg? That would offset the
return
value of the function by one byte as well, as well as the value printed
to
the backup history file. In fact, I think the original patch got that
wrong;
it would return the location of the *beginning* of the last xlog file.

You're right. As you say, the value (stopxlogfilename) printed to the
backup
history file is wrong. But, since the value is not used fortunately,
any troubles
have not come up. So, I think that we can just replace them.

Changing the return value doesn't seem like a good idea. If nothing else, it
would be complicated to explain what it returns. I committed a patch that
changes the waiting behavior, but not the return value or what's written
into the backup label file,

I also noticed that the 2nd BackupHistoryFileName call in that function
is
useless; histfilepath variable is already filled in earlier.

Somewhat confusingly, BackupHistoryFileName is called only once. Isn't 1st
(which probably you thought) BackupHistoryFilePath?

Ouch, you're right. That's subtle.

In order to prevent
confusion, we should add new local variable (histfilename) for the backup
history file name?

Agreed. I included that in the patch.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Ibrar Ahmed
EnterpriseDB http://www.enterprisedb.com