WIP archive_timeout patch

Started by Simon Riggs over 19 years ago, 11 messages
#1Simon Riggs
simon@2ndquadrant.com
1 attachment(s)

WIP archive_timeout.

All we need to do is add LWLock support to archiver.
Thoughts/ideas/hints welcome.

This is a patch-on-patch atop the xswitch.patch recently posted.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Attachments:

archivetimeout.patch (text/x-patch; charset=UTF-8)
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.81
diff -c -r2.81 backup.sgml
*** doc/src/sgml/backup.sgml	18 Jun 2006 15:38:35 -0000	2.81
--- doc/src/sgml/backup.sgml	31 Jul 2006 20:34:25 -0000
***************
*** 573,600 ****
      the <filename>pg_xlog/</> directory will contain large numbers of
      not-yet-archived segment files, which could eventually exceed available
      disk space. You are advised to monitor the archiving process to ensure that
!     it is working as you intend.
     </para>
  
     <para>
!     If you are concerned about being able to recover right up to the
!     current instant, you may want to take additional steps to ensure that
!     the current, partially-filled WAL segment is also copied someplace.
!     This is particularly important if your server generates only little WAL
!     traffic (or has slack periods where it does so), since it could take a
!     long time before a WAL segment file is completely filled and ready to
!     archive.  One possible way to handle this is to set up a
!     <application>cron</> job that periodically (once a minute, perhaps)
!     identifies the current WAL segment file and saves it someplace safe.
!     Then the combination of the archived WAL segments and the saved current
!     segment will be enough to ensure you can always restore to within a
!     minute of current time.  This behavior is not presently built into
!     <productname>PostgreSQL</> because we did not want to complicate the
!     definition of the <xref linkend="guc-archive-command"> by requiring it
!     to keep track of successively archived, but different, copies of the
!     same WAL file.  The <xref linkend="guc-archive-command"> is only
!     invoked on completed WAL segments. Except in the case of retrying a
!     failure, it will be called only once for any given file name.
     </para>
  
     <para>
--- 573,593 ----
      the <filename>pg_xlog/</> directory will contain large numbers of
      not-yet-archived segment files, which could eventually exceed available
      disk space. You are advised to monitor the archiving process to ensure that
!     it is working as you intend. 
     </para>
  
     <para>
!     The <xref linkend="guc-archive-command"> is only invoked on completed 
!     WAL segments. This could lead to delays in producing the next archive
!     if your server generates only little WAL traffic (or has slack periods 
!     where it does so). To ensure regular archives are produced you can 
!     specify an <xref linkend="guc-archive-timeout"> which will automatically
!     switch to a new WAL segment file during quieter periods. Archived files
!     produced in this way are still the same length as completely full files,
!     though entries made after the final processing instruction can be ignored.
!     Switching to a new WAL segment file can be performed manually using
!     <function>pg_switch_xlog</>. A variety of other utility functions are
!     also available, listed in <xref linkend="functions-admin-backup-table">.
     </para>
  
     <para>
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.71
diff -c -r1.71 config.sgml
*** doc/src/sgml/config.sgml	27 Jul 2006 08:30:41 -0000	1.71
--- doc/src/sgml/config.sgml	31 Jul 2006 20:34:31 -0000
***************
*** 1586,1591 ****
--- 1586,1614 ----
        </listitem>
       </varlistentry>
       
+      <varlistentry id="guc-archive-timeout" xreflabel="archive_timeout">
+       <term><varname>archive_timeout</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>archive_timeout</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         The <xref linkend="guc-archive-command"> is only invoked on completed 
+         WAL segments. This could lead to delays in producing the next archive
+         if your server generates only little WAL traffic (or has slack periods 
+         where it does so). This parameter provides regular archiving by
+         making sure that no more than <varname>archive_timeout</> seconds
+         go by before a new WAL segment file is produced for archiving, even
+         if that means we archive a partially filled file. Zero disables this
+         feature, which is the default. Valid values are from 1 to 60 seconds.
+         This parameter can only be set in the <filename>postgresql.conf</>
+         file or on the server command line. Be careful to set 
+         <varname>checkpoint_segments</> sufficiently high that you do not
+         inadvertently increase the rate at which checkpoints occur.
+        </para>
+       </listitem>
+      </varlistentry>
+      
       </variablelist>
      </sect2>
     </sect1>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.244
diff -c -r1.244 xlog.c
*** src/backend/access/transam/xlog.c	14 Jul 2006 14:52:17 -0000	1.244
--- src/backend/access/transam/xlog.c	31 Jul 2006 20:34:50 -0000
***************
*** 127,132 ****
--- 127,133 ----
  /* User-settable parameters */
  int			CheckPointSegments = 3;
  int			XLOGbuffers = 8;
+ int         XLogArchiveTimeout = 0;
  char	   *XLogArchiveCommand = NULL;
  char	   *XLOG_sync_method = NULL;
  const char	XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.333
diff -c -r1.333 guc.c
*** src/backend/utils/misc/guc.c	29 Jul 2006 03:02:56 -0000	1.333
--- src/backend/utils/misc/guc.c	31 Jul 2006 20:34:57 -0000
***************
*** 1020,1025 ****
--- 1020,1034 ----
  static struct config_int ConfigureNamesInt[] =
  {
  	{
+ 		{"archive_timeout", PGC_SIGHUP, WAL_SETTINGS,
+ 		 gettext_noop("Will force a switch to the next xlog file if a new file has not "
+                       "been started within N seconds."),
+ 		 gettext_noop("This allows regular continuous archiving to take place.")
+ 		},
+ 		&XLogArchiveTimeout,
+ 		0, 0, 60, NULL, NULL
+ 	},
+ 	{
  		{"post_auth_delay", PGC_BACKEND, DEVELOPER_OPTIONS,
  		 gettext_noop("Waits N seconds on connection startup after authentication."),
  		 gettext_noop("This allows attaching a debugger to the process."),
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.184
diff -c -r1.184 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample	25 Jul 2006 03:51:21 -0000	1.184
--- src/backend/utils/misc/postgresql.conf.sample	31 Jul 2006 20:34:57 -0000
***************
*** 167,174 ****
  
  # - Archiving -
  
! #archive_command = ''			# command to use to archive a logfile 
! 					# segment
  
  
  #---------------------------------------------------------------------------
--- 167,176 ----
  
  # - Archiving -
  
! # command to use to archive a logfile segment
! #archive_command = ''		
! #archive_timeout = 0        # automatic xlog switch gives regular archiving
!                             # range 0-60 in seconds, 0 is off 
  
  
  #---------------------------------------------------------------------------
Index: src/include/access/xlog.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/access/xlog.h,v
retrieving revision 1.72
diff -c -r1.72 xlog.h
*** src/include/access/xlog.h	13 Jul 2006 16:49:19 -0000	1.72
--- src/include/access/xlog.h	31 Jul 2006 20:34:58 -0000
***************
*** 139,144 ****
--- 139,145 ----
  extern int	CheckPointSegments;
  extern int	XLOGbuffers;
  extern char *XLogArchiveCommand;
+ extern int XLogArchiveTimeout;
  extern char *XLOG_sync_method;
  extern const char XLOG_sync_method_default[];
  
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#1)
Re: WIP archive_timeout patch

Simon Riggs <simon@2ndquadrant.com> writes:

> WIP archive_timeout.
> All we need to do is add LWLock support to archiver.
> Thoughts/ideas/hints welcome.

Hint: this isn't the archiver's problem, and so you don't need to get
the archiver involved in the solution. I'd suggest bgwriter as a
reasonably appropriate place instead.

regards, tom lane

#3Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: WIP archive_timeout patch

On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:

> Simon Riggs <simon@2ndquadrant.com> writes:
>
> > WIP archive_timeout.
> > All we need to do is add LWLock support to archiver.
> > Thoughts/ideas/hints welcome.
>
> Hint: this isn't the archiver's problem, and so you don't need to get
> the archiver involved in the solution. I'd suggest bgwriter as a
> reasonably appropriate place instead.

OK

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#4Simon Riggs
simon@2ndquadrant.com
In reply to: Simon Riggs (#3)
Re: WIP archive_timeout patch

On Thu, 2006-08-03 at 19:03 +0100, Simon Riggs wrote:

> On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:
>
> > Simon Riggs <simon@2ndquadrant.com> writes:
> >
> > > WIP archive_timeout.
> > > All we need to do is add LWLock support to archiver.
> > > Thoughts/ideas/hints welcome.
> >
> > Hint: this isn't the archiver's problem, and so you don't need to get
> > the archiver involved in the solution. I'd suggest bgwriter as a
> > reasonably appropriate place instead.
>
> OK

A slightly fuller answer:

Yes, that's a safer place than the archiver, so I'll add it to bgwriter as
you suggest. I should have a patch complete by Tuesday, since I'm travelling
now.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#5Simon Riggs
simon@2ndquadrant.com
In reply to: Simon Riggs (#1)
1 attachment(s)
Re: WIP archive_timeout patch

On Wed, 2006-08-16 at 10:09 +0100, Simon Riggs wrote:

> On Thu, 2006-08-03 at 19:03 +0100, Simon Riggs wrote:
>
> > On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:
> >
> > > Simon Riggs <simon@2ndquadrant.com> writes:
> > >
> > > > WIP archive_timeout.
> > > > All we need to do is add LWLock support to archiver.
> > > > Thoughts/ideas/hints welcome.
> > >
> > > Hint: this isn't the archiver's problem, and so you don't need to get
> > > the archiver involved in the solution. I'd suggest bgwriter as a
> > > reasonably appropriate place instead.

Revised patch enclosed, now believed to be production ready. This
implements regular log switching using the archive_timeout GUC.
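
For readers following along, the switching rule can be looked at in isolation. What follows is a minimal standalone sketch, not the patch code: SwitchState and switch_due() are hypothetical names, and a WAL segment is reduced to a plain counter. The idea is that a switch is forced only when archive_timeout seconds have passed and WAL is still in the same segment as at the last check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the bgwriter's bookkeeping; not PostgreSQL code. */
typedef struct SwitchState
{
    time_t   last_check_time;   /* when a switch was last attempted */
    uint32_t last_seg;          /* segment number seen at the last check */
} SwitchState;

/*
 * Returns true when a forced segment switch is due: the timeout has
 * elapsed and WAL is still in the segment recorded at the last check.
 * A timeout of zero disables the feature.
 */
static bool
switch_due(SwitchState *st, time_t now, uint32_t current_seg, int timeout_secs)
{
    if (timeout_secs <= 0)
        return false;               /* feature switched off */

    if (now - st->last_check_time < timeout_secs)
        return false;               /* timeout not reached yet */

    if (current_seg != st->last_seg)
    {
        /* WAL moved on by itself: remember the new segment, no switch. */
        st->last_seg = current_seg;
        return false;
    }

    /* Same segment for the whole interval: restart the timer and switch. */
    st->last_check_time = now;
    return true;
}
```

As in the bgwriter loop of the attached patch, noticing that the segment has already changed only records the new segment; the timer restarts only when a switch is attempted. The caller would then request the switch and record the segment it ends up in.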

Further patch enclosed implementing these changes plus the record type
version of pg_xlogfile_name_offset()
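
To illustrate what the two result columns hold, here is a rough standalone sketch of the filename/offset arithmetic. It is not the patch code: wal_file_pos(), WalFilePos and SEG_SIZE are hypothetical names, and it assumes 16 MB segments, file names built as three 8-digit hex fields (timeline, log id, segment), and the "previous segment" convention in which a location exactly on a segment boundary belongs to the segment that just ended.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed segment size: 16 MB (an assumption of this sketch). */
#define SEG_SIZE ((uint32_t) 16 * 1024 * 1024)

/* Hypothetical result type mirroring the (filename, fileoffset) record. */
typedef struct WalFilePos
{
    char     filename[25];  /* 3 x 8 hex digits plus terminating NUL */
    uint32_t offset;        /* byte offset within that segment file */
} WalFilePos;

/*
 * Parse a location such as "0/1000058" and fill in the segment file
 * name and the offset within it. Returns 0 on success, -1 on bad input.
 */
static int
wal_file_pos(const char *location, uint32_t timeline, WalFilePos *out)
{
    unsigned int xlogid;
    unsigned int xrecoff;
    uint32_t     seg;

    if (sscanf(location, "%8X/%8X", &xlogid, &xrecoff) != 2)
        return -1;
    if (xrecoff == 0)
        return -1;          /* sketch only: start of a log id is not handled */

    /* "Previous segment" rule: a boundary position maps to the older file. */
    seg = ((uint32_t) xrecoff - 1) / SEG_SIZE;

    snprintf(out->filename, sizeof(out->filename), "%08X%08X%08X",
             (unsigned int) timeline, xlogid, (unsigned int) seg);
    out->offset = (uint32_t) xrecoff - seg * SEG_SIZE;
    return 0;
}
```

Returning the name and the offset as separate fields means callers no longer have to split a single space-separated text value.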

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Attachments:

archive_timeout++.patch (text/x-patch; charset=utf-8)
Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.82
diff -c -r2.82 backup.sgml
*** doc/src/sgml/backup.sgml	6 Aug 2006 03:53:43 -0000	2.82
--- doc/src/sgml/backup.sgml	16 Aug 2006 22:05:10 -0000
***************
*** 573,600 ****
      the <filename>pg_xlog/</> directory will contain large numbers of
      not-yet-archived segment files, which could eventually exceed available
      disk space. You are advised to monitor the archiving process to ensure that
!     it is working as you intend.
     </para>
  
     <para>
!     If you are concerned about being able to recover right up to the
!     current instant, you may want to take additional steps to ensure that
!     the current, partially-filled WAL segment is also copied someplace.
!     This is particularly important if your server generates only little WAL
!     traffic (or has slack periods where it does so), since it could take a
!     long time before a WAL segment file is completely filled and ready to
!     archive.  One possible way to handle this is to set up a
!     <application>cron</> job that periodically (once a minute, perhaps)
!     identifies the current WAL segment file and saves it someplace safe.
!     Then the combination of the archived WAL segments and the saved current
!     segment will be enough to ensure you can always restore to within a
!     minute of current time.  This behavior is not presently built into
!     <productname>PostgreSQL</> because we did not want to complicate the
!     definition of the <xref linkend="guc-archive-command"> by requiring it
!     to keep track of successively archived, but different, copies of the
!     same WAL file.  The <xref linkend="guc-archive-command"> is only
!     invoked on completed WAL segments. Except in the case of retrying a
!     failure, it will be called only once for any given file name.
     </para>
  
     <para>
--- 573,593 ----
      the <filename>pg_xlog/</> directory will contain large numbers of
      not-yet-archived segment files, which could eventually exceed available
      disk space. You are advised to monitor the archiving process to ensure that
!     it is working as you intend. 
     </para>
  
     <para>
!     The <xref linkend="guc-archive-command"> is only invoked on completed 
!     WAL segments. This could lead to delays in producing the next archive
!     if your server generates only little WAL traffic (or has slack periods 
!     where it does so). To ensure regular archives are produced you can 
!     specify an <xref linkend="guc-archive-timeout"> which will automatically
!     switch to a new WAL segment file during quieter periods. Archived files
!     produced in this way are still the same length as completely full files,
!     though entries made after the final processing instruction can be ignored.
!     Switching to a new WAL segment file can be performed manually using
!     <function>pg_switch_xlog</>. A variety of other utility functions are
!     also available, listed in <xref linkend="functions-admin-backup-table">.
     </para>
  
     <para>
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.74
diff -c -r1.74 config.sgml
*** doc/src/sgml/config.sgml	15 Aug 2006 18:26:58 -0000	1.74
--- doc/src/sgml/config.sgml	16 Aug 2006 22:05:17 -0000
***************
*** 1584,1589 ****
--- 1584,1612 ----
        </listitem>
       </varlistentry>
       
+      <varlistentry id="guc-archive-timeout" xreflabel="archive_timeout">
+       <term><varname>archive_timeout</varname> (<type>integer</type>)</term>
+       <indexterm>
+        <primary><varname>archive_timeout</> configuration parameter</primary>
+       </indexterm>
+       <listitem>
+        <para>
+         The <xref linkend="guc-archive-command"> is only invoked on completed 
+         WAL segments. This could lead to delays in producing the next archive
+         if your server generates only little WAL traffic (or has slack periods 
+         where it does so). This parameter provides regular archiving by
+         making sure that no more than <varname>archive_timeout</> seconds
+         go by before a new WAL segment file is produced for archiving, even
+         if that means we archive a partially filled file. Zero disables this
+         feature, which is the default. Valid values are from 1 to 60 seconds.
+         This parameter can only be set in the <filename>postgresql.conf</>
+         file or on the server command line. Be careful to set 
+         <varname>checkpoint_segments</> sufficiently high that you do not
+         inadvertently increase the rate at which checkpoints occur.
+        </para>
+       </listitem>
+      </varlistentry>
+      
       </variablelist>
      </sect2>
     </sect1>
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.247
diff -c -r1.247 xlog.c
*** src/backend/access/transam/xlog.c	7 Aug 2006 16:57:56 -0000	1.247
--- src/backend/access/transam/xlog.c	16 Aug 2006 22:05:21 -0000
***************
*** 23,28 ****
--- 23,29 ----
  #include <sys/time.h>
  
  #include "access/clog.h"
+ #include "access/heapam.h"
  #include "access/multixact.h"
  #include "access/subtrans.h"
  #include "access/transam.h"
***************
*** 32,37 ****
--- 33,40 ----
  #include "access/xlogutils.h"
  #include "catalog/catversion.h"
  #include "catalog/pg_control.h"
+ #include "catalog/pg_type.h"
+ #include "funcapi.h"
  #include "miscadmin.h"
  #include "pgstat.h"
  #include "postmaster/bgwriter.h"
***************
*** 128,133 ****
--- 131,137 ----
  /* User-settable parameters */
  int			CheckPointSegments = 3;
  int			XLOGbuffers = 8;
+ int         XLogArchiveTimeout = 0;
  char	   *XLogArchiveCommand = NULL;
  char	   *XLOG_sync_method = NULL;
  const char	XLOG_sync_method_default[] = DEFAULT_SYNC_METHOD_STR;
***************
*** 5308,5313 ****
--- 5312,5333 ----
  }
  
  /*
+  * Get the current WAL Insert pointer ... shared lock is sufficient
+  */
+ XLogRecPtr
+ GetWALInsertPtr(void)
+ {
+ 	XLogCtlInsert *Insert = &XLogCtl->Insert;
+     XLogRecPtr InsertRecPtr;
+ 
+ 	LWLockAcquire(WALInsertLock, LW_SHARED);
+ 	INSERT_RECPTR(InsertRecPtr, Insert, Insert->curridx);
+ 	LWLockRelease(WALInsertLock);
+ 
+ 	return InsertRecPtr;
+ }
+ 
+ /*
   * GetRecentNextXid - get the nextXid value saved by the most recent checkpoint
   *
   * This is currently used only by the autovacuum daemon.  To check for
***************
*** 5728,5734 ****
   * or the end+1 address of the prior segment if we did not need to
   * write a switch record because we are already at segment start.
   */
! static XLogRecPtr
  RequestXLogSwitch(void)
  {
  	XLogRecPtr	RecPtr;
--- 5748,5754 ----
   * or the end+1 address of the prior segment if we did not need to
   * write a switch record because we are already at segment start.
   */
! XLogRecPtr
  RequestXLogSwitch(void)
  {
  	XLogRecPtr	RecPtr;
***************
*** 6336,6356 ****
  
  /*
   * Report the current WAL location (same format as pg_start_backup etc)
   */
  Datum
  pg_current_xlog_location(PG_FUNCTION_ARGS)
  {
  	text	   *result;
- 	XLogCtlInsert *Insert = &XLogCtl->Insert;
- 	XLogRecPtr	current_recptr;
  	char		location[MAXFNAMELEN];
  
  	/*
! 	 * Get the current end-of-WAL position ... shared lock is sufficient
  	 */
! 	LWLockAcquire(WALInsertLock, LW_SHARED);
! 	INSERT_RECPTR(current_recptr, Insert, Insert->curridx);
! 	LWLockRelease(WALInsertLock);
  
  	snprintf(location, sizeof(location), "%X/%X",
  			 current_recptr.xlogid, current_recptr.xrecoff);
--- 6356,6407 ----
  
  /*
   * Report the current WAL location (same format as pg_start_backup etc)
+  *
+  * This is the current Write pointer, so is useful for determining the
+  * current byte offset within a WAL file that has valid data written to it. 
+  * Note that data written is not always committed yet, see XLogInsert()
   */
  Datum
  pg_current_xlog_location(PG_FUNCTION_ARGS)
  {
  	text	   *result;
  	char		location[MAXFNAMELEN];
  
  	/*
! 	 * Get the current end-of-WAL position by updating LogwrtResult
  	 */
! 	{
! 		/* use volatile pointer to prevent code rearrangement */
! 		volatile XLogCtlData *xlogctl = XLogCtl;
! 
! 		SpinLockAcquire(&xlogctl->info_lck);
! 		LogwrtResult = xlogctl->LogwrtResult;
! 		SpinLockRelease(&xlogctl->info_lck);
! 	}
! 
! 	snprintf(location, sizeof(location), "%X/%X",
! 			 LogwrtResult.Write.xlogid, LogwrtResult.Write.xrecoff);
! 
! 	result = DatumGetTextP(DirectFunctionCall1(textin,
! 											   CStringGetDatum(location)));
! 	PG_RETURN_TEXT_P(result);
! }
! 
! /*
!  * Report the current WAL location (same format as pg_start_backup etc)
!  *
!  * This is the current Insert pointer. The name is deliberately chosen
!  * to be different from pg_current_xlog_location so people do not confuse
!  * the two functions. This function is mostly for debugging purposes.
!  */
! Datum
! pg_current_wal_insert_pointer(PG_FUNCTION_ARGS)
! {
! 	text	   *result;
! 	XLogRecPtr	current_recptr;
! 	char		location[MAXFNAMELEN];
! 
!     current_recptr = GetWALInsertPtr();
  
  	snprintf(location, sizeof(location), "%X/%X",
  			 current_recptr.xlogid, current_recptr.xrecoff);
***************
*** 6372,6378 ****
  pg_xlogfile_name_offset(PG_FUNCTION_ARGS)
  {
  	text	   *location = PG_GETARG_TEXT_P(0);
- 	text	   *result;
  	char	   *locationstr;
  	unsigned int uxlogid;
  	unsigned int uxrecoff;
--- 6423,6428 ----
***************
*** 6381,6387 ****
  	uint32		xrecoff;
  	XLogRecPtr	locationpoint;
  	char		xlogfilename[MAXFNAMELEN];
! 
  	locationstr = DatumGetCString(DirectFunctionCall1(textout,
  												PointerGetDatum(location)));
  
--- 6431,6445 ----
  	uint32		xrecoff;
  	XLogRecPtr	locationpoint;
  	char		xlogfilename[MAXFNAMELEN];
!     Datum       values[2];
!     bool        isnull[2];
!     TupleDesc   resultTupleDesc;
!     HeapTuple   resultHeapTuple;
! 	Datum	    result;
! 
!     /*
!      * Read input and parse
!      */
  	locationstr = DatumGetCString(DirectFunctionCall1(textout,
  												PointerGetDatum(location)));
  
***************
*** 6394,6411 ****
  	locationpoint.xlogid = uxlogid;
  	locationpoint.xrecoff = uxrecoff;
  
  	XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
  	XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
  
  	xrecoff = locationpoint.xrecoff - xlogseg * XLogSegSize;
- 	snprintf(xlogfilename + strlen(xlogfilename),
- 			 sizeof(xlogfilename) - strlen(xlogfilename),
- 			 " %u",
- 			 (unsigned int) xrecoff);
  
! 	result = DatumGetTextP(DirectFunctionCall1(textin,
! 											   CStringGetDatum(xlogfilename)));
! 	PG_RETURN_TEXT_P(result);
  }
  
  /*
--- 6452,6493 ----
  	locationpoint.xlogid = uxlogid;
  	locationpoint.xrecoff = uxrecoff;
  
+ 	/* Construct a tuple descriptor for the result rows. */
+ 	resultTupleDesc = CreateTemplateTupleDesc(2, false);
+ 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 1, "filename",
+ 					   TEXTOID, -1, 0);
+ 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "fileoffset",
+ 					   INT4OID, -1, 0);
+ 
+ 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
+ 
+     /*
+      * xlogfilename
+      */
  	XLByteToPrevSeg(locationpoint, xlogid, xlogseg);
+ 
  	XLogFileName(xlogfilename, ThisTimeLineID, xlogid, xlogseg);
  
+     values[0] = DirectFunctionCall1(textin,
+ 										CStringGetDatum(xlogfilename));
+     isnull[0] = false;
+ 
+     /*
+      * offset
+      */
  	xrecoff = locationpoint.xrecoff - xlogseg * XLogSegSize;
  
!     values[1] = UInt32GetDatum(xrecoff);
!     isnull[1] = false;
! 
!     /*
!      * Tuple jam: Having first prepared your Datums, then squash together
!      */
!     resultHeapTuple = heap_form_tuple(resultTupleDesc, values, isnull);
! 
!     result = HeapTupleGetDatum(resultHeapTuple);
! 
!     PG_RETURN_DATUM(result);
  }
  
  /*
Index: src/backend/postmaster/bgwriter.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/bgwriter.c,v
retrieving revision 1.26
diff -c -r1.26 bgwriter.c
*** src/backend/postmaster/bgwriter.c	14 Jul 2006 14:52:22 -0000	1.26
--- src/backend/postmaster/bgwriter.c	16 Aug 2006 22:05:22 -0000
***************
*** 48,53 ****
--- 48,54 ----
  
  #include "libpq/pqsignal.h"
  #include "miscadmin.h"
+ #include "access/xlog_internal.h"
  #include "postmaster/bgwriter.h"
  #include "storage/fd.h"
  #include "storage/freespace.h"
***************
*** 144,149 ****
--- 145,154 ----
  static bool ckpt_active = false;
  
  static time_t last_checkpoint_time;
+ static time_t last_check_xlog_time;
+ static uint32 last_check_xlogid;
+ static uint32 last_check_xlogseg;
+ static XLogRecPtr pre_switch_xlog_recptr;
  
  
  static void bg_quickdie(SIGNAL_ARGS);
***************
*** 205,214 ****
  #endif
  
  	/*
! 	 * Initialize so that first time-driven checkpoint happens at the correct
  	 * time.
  	 */
! 	last_checkpoint_time = time(NULL);
  
  	/*
  	 * Create a resource owner to keep track of our resources (currently
--- 210,231 ----
  #endif
  
  	/*
! 	 * Initialize so that first time-driven event happens at the correct
  	 * time.
  	 */
! 	last_check_xlog_time = last_checkpoint_time = time(NULL);
! 
!     /*
!      * Allow bgwriter to read xlog details
!      */
!     InitXLOGAccess();
! 
!     /*
!      * Initialize the values for logid and segid, so we tell whether
!      * we need to force log switching
!      */
!     pre_switch_xlog_recptr = GetWALInsertPtr();
! 	XLByteToPrevSeg(pre_switch_xlog_recptr, last_check_xlogid, last_check_xlogseg);
  
  	/*
  	 * Create a resource owner to keep track of our resources (currently
***************
*** 309,315 ****
  		bool		do_checkpoint = false;
  		bool		force_checkpoint = false;
  		time_t		now;
! 		int			elapsed_secs;
  		long		udelay;
  
  		/*
--- 326,333 ----
  		bool		do_checkpoint = false;
  		bool		force_checkpoint = false;
  		time_t		now;
! 		int			elapsed_since_checkpoint_secs;
! 		int			elapsed_since_switch_xlog_secs;
  		long		udelay;
  
  		/*
***************
*** 348,355 ****
  		 * last one.
  		 */
  		now = time(NULL);
! 		elapsed_secs = now - last_checkpoint_time;
! 		if (elapsed_secs >= CheckPointTimeout)
  			do_checkpoint = true;
  
  		/*
--- 366,373 ----
  		 * last one.
  		 */
  		now = time(NULL);
! 		elapsed_since_checkpoint_secs = now - last_checkpoint_time;
! 		if (elapsed_since_checkpoint_secs >= CheckPointTimeout)
  			do_checkpoint = true;
  
  		/*
***************
*** 366,375 ****
  			 * CheckPointTimeout < CheckPointWarning.
  			 */
  			if (BgWriterShmem->ckpt_time_warn &&
! 				elapsed_secs < CheckPointWarning)
  				ereport(LOG,
  						(errmsg("checkpoints are occurring too frequently (%d seconds apart)",
! 								elapsed_secs),
  						 errhint("Consider increasing the configuration parameter \"checkpoint_segments\".")));
  			BgWriterShmem->ckpt_time_warn = false;
  
--- 384,393 ----
  			 * CheckPointTimeout < CheckPointWarning.
  			 */
  			if (BgWriterShmem->ckpt_time_warn &&
! 				elapsed_since_checkpoint_secs < CheckPointWarning)
  				ereport(LOG,
  						(errmsg("checkpoints are occurring too frequently (%d seconds apart)",
! 								elapsed_since_checkpoint_secs),
  						 errhint("Consider increasing the configuration parameter \"checkpoint_segments\".")));
  			BgWriterShmem->ckpt_time_warn = false;
  
***************
*** 403,408 ****
--- 421,483 ----
  		else
  			BgBufferSync();
  
+         /*
+          * Check for archive_timeout, if so, switch xlog files 
+          */
+         if (XLogArchiveTimeout > 0)
+         {
+     		/*
+     		 * If we did a checkpoint, we probably need to get the time again
+     		 * since its likely to be a while since that started
+     		 */
+             if (do_checkpoint)
+         		now = time(NULL);
+ 
+     		elapsed_since_switch_xlog_secs = now - last_check_xlog_time;
+ 
+             /*
+              * Check whether the timeout is due
+              */
+     		if (elapsed_since_switch_xlog_secs >= XLogArchiveTimeout)
+             {
+             	uint32		current_xlogid;
+             	uint32		current_xlogseg;
+ 
+                 /*
+                  * If the timeout is due, check whether or not we're still
+                  * in the same xlog file as last switch. If we are then we 
+                  * know we want to force a switch. We check the Insert
+                  * pointer here, not the Write pointer, but it's not important
+                  */
+                 pre_switch_xlog_recptr = GetWALInsertPtr();
+             	XLByteToPrevSeg(pre_switch_xlog_recptr, current_xlogid, current_xlogseg);
+ 
+                 if (current_xlogid == last_check_xlogid &&
+                     current_xlogseg == last_check_xlogseg)
+                 {
+                     XLogRecPtr  switch_xlog_recptr = RequestXLogSwitch();
+ 
+                     /*
+                      * Report activity and reset last_switch values
+                      * only if we actually performed a switch
+                      */
+                     if (XLByteLT(pre_switch_xlog_recptr, switch_xlog_recptr))
+                     {
+         				ereport(LOG,
+     						(errmsg("automatic xlog switch performed (archive_timeout=%d)",
+                                     XLogArchiveTimeout)));
+                        	XLByteToPrevSeg(switch_xlog_recptr, last_check_xlogid, last_check_xlogseg);
+                     }
+                 	last_check_xlog_time = time(NULL);
+                 }
+                 else
+                 {
+                     last_check_xlogid = current_xlogid;
+                     last_check_xlogseg = current_xlogseg;
+                 }
+             }
+         }
+ 
  		/*
  		 * Nap for the configured time, or sleep for 10 seconds if there is no
  		 * bgwriter activity configured.
***************
*** 416,425 ****
  		 */
  		if ((bgwriter_all_percent > 0.0 && bgwriter_all_maxpages > 0) ||
  			(bgwriter_lru_percent > 0.0 && bgwriter_lru_maxpages > 0))
! 			udelay = BgWriterDelay * 1000L;
! 		else
! 			udelay = 10000000L;
! 		while (udelay > 1000000L)
  		{
  			if (got_SIGHUP || checkpoint_requested || shutdown_requested)
  				break;
--- 491,503 ----
  		 */
  		if ((bgwriter_all_percent > 0.0 && bgwriter_all_maxpages > 0) ||
  			(bgwriter_lru_percent > 0.0 && bgwriter_lru_maxpages > 0))
!    			udelay = BgWriterDelay * 1000L;
! 		else if (XLogArchiveTimeout > 0)
! 			udelay = 1000000L;   /* One second */
!         else
! 			udelay = 10000000L;  /* Ten seconds */
! 
! 		while (udelay > 999999L)
  		{
  			if (got_SIGHUP || checkpoint_requested || shutdown_requested)
  				break;
***************
*** 427,432 ****
--- 505,511 ----
  			AbsorbFsyncRequests();
  			udelay -= 1000000L;
  		}
+ 
  		if (!(got_SIGHUP || checkpoint_requested || shutdown_requested))
  			pg_usleep(udelay);
  	}
Index: src/backend/utils/misc/guc.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.342
diff -c -r1.342 guc.c
*** src/backend/utils/misc/guc.c	15 Aug 2006 18:26:59 -0000	1.342
--- src/backend/utils/misc/guc.c	16 Aug 2006 22:05:25 -0000
***************
*** 29,34 ****
--- 29,35 ----
  #include "access/gin.h"
  #include "access/twophase.h"
  #include "access/xact.h"
+ #include "access/xlog_internal.h"
  #include "catalog/namespace.h"
  #include "commands/async.h"
  #include "commands/vacuum.h"
***************
*** 1020,1025 ****
--- 1021,1035 ----
  static struct config_int ConfigureNamesInt[] =
  {
  	{
+ 		{"archive_timeout", PGC_SIGHUP, WAL_SETTINGS,
+ 		 gettext_noop("Will force a switch to the next xlog file if a new file has not "
+                       "been started within N seconds."),
+ 		 gettext_noop("This allows regular continuous archiving to take place.")
+ 		},
+ 		&XLogArchiveTimeout,
+ 		0, 0, 60, NULL, NULL
+ 	},
+ 	{
  		{"post_auth_delay", PGC_BACKEND, DEVELOPER_OPTIONS,
  		 gettext_noop("Waits N seconds on connection startup after authentication."),
  		 gettext_noop("This allows attaching a debugger to the process."),
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
retrieving revision 1.186
diff -c -r1.186 postgresql.conf.sample
*** src/backend/utils/misc/postgresql.conf.sample	15 Aug 2006 18:26:59 -0000	1.186
--- src/backend/utils/misc/postgresql.conf.sample	16 Aug 2006 22:05:25 -0000
***************
*** 167,174 ****
  
  # - Archiving -
  
! #archive_command = ''			# command to use to archive a logfile 
! 					# segment
  
  
  #---------------------------------------------------------------------------
--- 167,176 ----
  
  # - Archiving -
  
! # command to use to archive a logfile segment
! #archive_command = ''		
! #archive_timeout = 0        # automatic xlog switch gives regular archiving
!                             # range 0-60 in seconds, 0 is off 
  
  
  #---------------------------------------------------------------------------
Index: src/include/access/xlog.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/access/xlog.h,v
retrieving revision 1.72
diff -c -r1.72 xlog.h
*** src/include/access/xlog.h	13 Jul 2006 16:49:19 -0000	1.72
--- src/include/access/xlog.h	16 Aug 2006 22:05:25 -0000
***************
*** 139,144 ****
--- 139,145 ----
  extern int	CheckPointSegments;
  extern int	XLOGbuffers;
  extern char *XLogArchiveCommand;
+ /* extern int XLogArchiveTimeout; -- included in xlog_internal.h */
  extern char *XLOG_sync_method;
  extern const char XLOG_sync_method_default[];
  
Index: src/include/access/xlog_internal.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/access/xlog_internal.h,v
retrieving revision 1.15
diff -c -r1.15 xlog_internal.h
*** src/include/access/xlog_internal.h	7 Aug 2006 16:57:57 -0000	1.15
--- src/include/access/xlog_internal.h	16 Aug 2006 22:05:25 -0000
***************
*** 237,242 ****
--- 237,249 ----
  
  extern const RmgrData RmgrTable[];
  
+ /* 
+  * These are required to allow xlog switching from bgwriter
+  */
+ extern XLogRecPtr RequestXLogSwitch(void);
+ extern XLogRecPtr GetWALInsertPtr(void);
+ extern int XLogArchiveTimeout;
+ 
  /*
   * These aren't in xlog.h because I'd rather not include fmgr.h there.
   */
***************
*** 244,249 ****
--- 251,257 ----
  extern Datum pg_stop_backup(PG_FUNCTION_ARGS);
  extern Datum pg_switch_xlog(PG_FUNCTION_ARGS);
  extern Datum pg_current_xlog_location(PG_FUNCTION_ARGS);
+ extern Datum pg_current_wal_insert_pointer(PG_FUNCTION_ARGS);
  extern Datum pg_xlogfile_name_offset(PG_FUNCTION_ARGS);
  extern Datum pg_xlogfile_name(PG_FUNCTION_ARGS);
  
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.420
diff -c -r1.420 pg_proc.h
*** src/include/catalog/pg_proc.h	6 Aug 2006 03:53:44 -0000	1.420
--- src/include/catalog/pg_proc.h	16 Aug 2006 22:05:31 -0000
***************
*** 3105,3111 ****
  DESCR("Switch to new xlog file");
  DATA(insert OID = 2849 ( pg_current_xlog_location	PGNSP PGUID 12 f f t f v 0 25 "" _null_ _null_ _null_ pg_current_xlog_location - _null_ ));
  DESCR("current xlog location");
! DATA(insert OID = 2850 ( pg_xlogfile_name_offset	PGNSP PGUID 12 f f t f i 1 25 "25" _null_ _null_ _null_ pg_xlogfile_name_offset - _null_ ));
  DESCR("xlog filename and byte offset, given an xlog location");
  DATA(insert OID = 2851 ( pg_xlogfile_name			PGNSP PGUID 12 f f t f i 1 25 "25" _null_ _null_ _null_ pg_xlogfile_name - _null_ ));
  DESCR("xlog filename, given an xlog location");
--- 3105,3113 ----
  DESCR("Switch to new xlog file");
  DATA(insert OID = 2849 ( pg_current_xlog_location	PGNSP PGUID 12 f f t f v 0 25 "" _null_ _null_ _null_ pg_current_xlog_location - _null_ ));
  DESCR("current xlog location");
! DATA(insert OID = 2852 ( pg_current_wal_insert_pointer	PGNSP PGUID 12 f f t f v 0 25 "" _null_ _null_ _null_ pg_current_wal_insert_pointer - _null_ ));
! DESCR("current wal insert pointer");
! DATA(insert OID = 2850 ( pg_xlogfile_name_offset	PGNSP PGUID 12 f f t f i 1 2249 "25" "{25,25,23}" "{i,o,o}" "{wal_location,filename,fileoffset}" pg_xlogfile_name_offset - _null_ ));
  DESCR("xlog filename and byte offset, given an xlog location");
  DATA(insert OID = 2851 ( pg_xlogfile_name			PGNSP PGUID 12 f f t f i 1 25 "25" _null_ _null_ _null_ pg_xlogfile_name - _null_ ));
  DESCR("xlog filename, given an xlog location");
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#5)
Re: [PATCHES] WIP archive_timeout patch

Simon Riggs <simon@2ndquadrant.com> writes:

> Revised patch enclosed, now believed to be production ready. This
> implements regular log switching using the archive_timeout GUC.

> Further patch enclosed implementing these changes plus the record type
> version of pg_xlogfile_name_offset()

Applied with minor changes --- it seemed better to me to put tracking of
the last xlog switch time directly into xlog.c, instead of having the
bgwriter code try to determine whether a switch had happened recently.

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log. The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one. I'm not sure how much trouble it's worth taking to
prevent this scenario, though. If you can't afford a WAL file switch
every five minutes, you probably shouldn't be using archive_timeout
anyway ...

regards, tom lane
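
The interaction Tom describes can be reproduced with a toy model. The sketch below is not PostgreSQL code: the `ToyWal` struct, the `toy_*` functions, and the tiny `SEG_SIZE` are all hypothetical names invented for illustration. It assumes a checkpoint is skipped when the insert position has not moved since the last checkpoint record, and a switch is skipped when the current segment is empty.

```c
#include <assert.h>

#define SEG_SIZE 16L            /* toy segment size in "records", not the real segment size */

typedef struct
{
    long insert_pos;            /* current WAL insert position */
    long last_ckpt_end;         /* insert position just after the last checkpoint record */
} ToyWal;

/* Checkpoint: skipped if nothing was written since the last checkpoint. */
static int toy_checkpoint(ToyWal *w)
{
    if (w->insert_pos == w->last_ckpt_end)
        return 0;
    w->insert_pos += 1;                     /* write the checkpoint record */
    w->last_ckpt_end = w->insert_pos;
    return 1;
}

/* Switch: skipped if the current segment holds no records. */
static int toy_switch(ToyWal *w)
{
    if (w->insert_pos % SEG_SIZE == 0)
        return 0;
    /* advance to the start of the next segment */
    w->insert_pos = (w->insert_pos / SEG_SIZE + 1) * SEG_SIZE;
    return 1;
}

/*
 * Write initial_records of real activity, then go idle for `cycles`
 * checkpoint intervals. Returns how many segment switches were forced.
 */
int count_idle_switches(int cycles, long initial_records)
{
    ToyWal w = { initial_records, 0 };
    int switches = 0;

    assert(cycles >= 0);
    assert(initial_records >= 0 && initial_records < SEG_SIZE);

    for (int i = 0; i < cycles; i++)
    {
        toy_checkpoint(&w);                 /* sees the switch as "something happened" */
        switches += toy_switch(&w);         /* sees the checkpoint record as "something happened" */
    }
    return switches;
}
```

In this model a single record of real activity is enough to start the cycle: `count_idle_switches(5, 1)` reports one switch per interval, while a WAL that has never been written to (`initial_records` of 0) stays quiet.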

#7Florian G. Pflug
fgp@phlo.org
In reply to: Tom Lane (#6)
Re: [PATCHES] WIP archive_timeout patch

Tom Lane wrote:

> Simon Riggs <simon@2ndquadrant.com> writes:

>> Revised patch enclosed, now believed to be production ready. This
>> implements regular log switching using the archive_timeout GUC.

>> Further patch enclosed implementing these changes plus the record type
>> version of pg_xlogfile_name_offset()

> Applied with minor changes --- it seemed better to me to put tracking of
> the last xlog switch time directly into xlog.c, instead of having the
> bgwriter code try to determine whether a switch had happened recently.

> I noticed a minor annoyance while testing: when the system is completely
> idle, you get a forced segment switch every checkpoint_timeout seconds,
> even though there is nothing useful to log. The checkpoint code is
> smart enough not to do a checkpoint if nothing has happened since the
> last one, and the xlog switch code is smart enough not to do a switch
> if nothing has happened since the last one ... but they aren't talking
> to each other and so each one's change looks like "something happened"
> to the other one. I'm not sure how much trouble it's worth taking to
> prevent this scenario, though. If you can't afford a WAL file switch
> every five minutes, you probably shouldn't be using archive_timeout
> anyway ...

Actually, this behaviour IMHO even has its advantages - if you can be
sure that at least one WAL segment will be archived every 5 minutes, then
it's easy to monitor the replication - you can just watch the logfile of
the slave, and send a failure notice if no logfile is imported at least
every 10 minutes or so.

Of course, for this to be useful, the documentation would have to tell
people about that behaviour, and it couldn't easily be changed in the next
release...

greetings, Florian Pflug
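
The monitoring idea Florian outlines boils down to a staleness check. Below is a minimal sketch, assuming the time of the last import can be read from the modification time of some file or directory on the slave; that assumption, the function names, and the choice of path are all hypothetical and depend on the local setup. The 10-minute threshold would be passed as 600 seconds.

```c
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Pure check: returns 1 if more than max_gap_secs elapsed since last_import. */
int wal_import_is_stale(time_t last_import, time_t now, double max_gap_secs)
{
    assert(max_gap_secs > 0.0);
    return difftime(now, last_import) > max_gap_secs;
}

/*
 * Convenience wrapper (POSIX stat): treats the modification time of
 * `path` as the time of the last import. Which path to watch is an
 * assumption about the installation, not something the thread specifies.
 * Returns 1 if stale, 0 if fresh, -1 if the path cannot be examined.
 */
int path_is_stale(const char *path, double max_gap_secs)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return wal_import_is_stale(st.st_mtime, time(NULL), max_gap_secs);
}
```

A cron job could call `path_is_stale()` once a minute and send the failure notice whenever it returns a non-zero value, treating an unreadable path as a failure as well.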

#8Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: [PATCHES] WIP archive_timeout patch

On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:

> Simon Riggs <simon@2ndquadrant.com> writes:

>> Revised patch enclosed, now believed to be production ready. This
>> implements regular log switching using the archive_timeout GUC.

>> Further patch enclosed implementing these changes plus the record type
>> version of pg_xlogfile_name_offset()

> Applied with minor changes --- it seemed better to me to put tracking of
> the last xlog switch time directly into xlog.c, instead of having the
> bgwriter code try to determine whether a switch had happened recently.

Code location: sure.

> I noticed a minor annoyance while testing: when the system is completely
> idle, you get a forced segment switch every checkpoint_timeout seconds,
> even though there is nothing useful to log. The checkpoint code is
> smart enough not to do a checkpoint if nothing has happened since the
> last one, and the xlog switch code is smart enough not to do a switch
> if nothing has happened since the last one ... but they aren't talking
> to each other and so each one's change looks like "something happened"
> to the other one. I'm not sure how much trouble it's worth taking to
> prevent this scenario, though. If you can't afford a WAL file switch
> every five minutes, you probably shouldn't be using archive_timeout
> anyway ...

I noticed that minor annoyance and understood that I had fixed it before
submitting. That was the reason for putting the code in bgwriter to
check whether the pointer had moved before attempting the switch...
perhaps that functionality has been removed?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#9Zeugswetter Andreas DCP SD
ZeugswetterA@spardat.at
In reply to: Tom Lane (#6)
Re: [PATCHES] WIP archive_timeout patch

> I noticed a minor annoyance while testing: when the system is
> completely idle, you get a forced segment switch every
> checkpoint_timeout seconds, even though there is nothing
> useful to log. The checkpoint code is smart enough not to do
> a checkpoint if nothing has happened since the last one, and
> the xlog switch code is smart enough not to do a switch if
> nothing has happened since the last one ... but they aren't
> talking to each other and so each one's change looks like
> "something happened" to the other one. I'm not sure how much
> trouble it's worth taking to prevent this scenario, though.
> If you can't afford a WAL file switch every five minutes,
> you probably shouldn't be using archive_timeout anyway ...

Um, I would have thought practical timeouts would be more than
5 minutes rather than less. So this does seem like a problem to me :-(

Andreas

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#8)
Re: [PATCHES] WIP archive_timeout patch

Simon Riggs <simon@2ndquadrant.com> writes:

> On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:

>> I noticed a minor annoyance while testing: when the system is completely
>> idle, you get a forced segment switch every checkpoint_timeout seconds,
>> even though there is nothing useful to log. The checkpoint code is
>> smart enough not to do a checkpoint if nothing has happened since the
>> last one, and the xlog switch code is smart enough not to do a switch
>> if nothing has happened since the last one ... but they aren't talking
>> to each other and so each one's change looks like "something happened"
>> to the other one.

> I noticed that minor annoyance and understood that I had fixed it before
> submitting. That was the reason for putting the code in bgwriter to
> check whether the pointer had moved before attempting the switch...
> perhaps that functionality has been removed?

No, the original form of the patch was equally vulnerable. AFAICS the
only way to prevent this would be for RequestXLogSwitch (or really
XLogInsert, which does the heavy lifting for this) to suppress a switch
if the current segment is empty *or* contains only a checkpoint WAL
record. Basically it'd have to pretend the checkpoint record is not
there. This is doable but seems a bit weird --- in particular, that
would mean that pg_switch_xlog sometimes returns a pointer less than
pg_current_xlog_location, which might confuse backup scripts.

On the whole I'm leaning towards not changing it. As Florian mentioned,
guaranteed segment-every-checkpoint isn't completely without its uses.
And people who are looking for low WAL volume ought to be stretching
out their checkpoint intervals anyway.

regards, tom lane
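
The suppression rule Tom sketches (and leans against adopting) can be written as a small predicate. This is a toy illustration only, not PostgreSQL code: the function name and the idea of counting records per segment are assumptions made for the example, and the real decision would have to be made inside the WAL insertion path.

```c
#include <assert.h>

/*
 * Hypothetical predicate: should a requested segment switch be skipped?
 *   total_records      - WAL records written to the current segment
 *   checkpoint_records - how many of those are checkpoint records
 * Returns 1 to suppress the switch, 0 to perform it.
 */
int should_suppress_switch(int total_records, int checkpoint_records)
{
    assert(total_records >= 0);
    assert(checkpoint_records >= 0 && checkpoint_records <= total_records);

    if (total_records == 0)
        return 1;               /* empty segment: nothing worth archiving */
    if (total_records == 1 && checkpoint_records == 1)
        return 1;               /* only a checkpoint record: pretend it is not there */
    return 0;                   /* real activity: let the switch go ahead */
}
```

The cost Tom points out still applies to any rule of this shape: when the switch is suppressed despite a checkpoint record being present, pg_switch_xlog could report a pointer earlier than pg_current_xlog_location.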

#11Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#10)
Re: [PATCHES] WIP archive_timeout patch

On Fri, 2006-08-18 at 08:52 -0400, Tom Lane wrote:

> Simon Riggs <simon@2ndquadrant.com> writes:

>> On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:

>>> I noticed a minor annoyance while testing: when the system is completely
>>> idle, you get a forced segment switch every checkpoint_timeout seconds,
>>> even though there is nothing useful to log. The checkpoint code is
>>> smart enough not to do a checkpoint if nothing has happened since the
>>> last one, and the xlog switch code is smart enough not to do a switch
>>> if nothing has happened since the last one ... but they aren't talking
>>> to each other and so each one's change looks like "something happened"
>>> to the other one.

>> I noticed that minor annoyance and understood that I had fixed it before
>> submitting. That was the reason for putting the code in bgwriter to
>> check whether the pointer had moved before attempting the switch...
>> perhaps that functionality has been removed?

> No, the original form of the patch was equally vulnerable. AFAICS the
> only way to prevent this would be for RequestXLogSwitch (or really
> XLogInsert, which does the heavy lifting for this) to suppress a switch
> if the current segment is empty *or* contains only a checkpoint WAL
> record. Basically it'd have to pretend the checkpoint record is not
> there. This is doable but seems a bit weird --- in particular, that
> would mean that pg_switch_xlog sometimes returns a pointer less than
> pg_current_xlog_location, which might confuse backup scripts.

> On the whole I'm leaning towards not changing it. As Florian mentioned,
> guaranteed segment-every-checkpoint isn't completely without its uses.
> And people who are looking for low WAL volume ought to be stretching
> out their checkpoint intervals anyway.

Agreed.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com