[TODO] Track number of files ready to be archived in pg_stat_archiver

Started by Julien Rouhaud over 11 years ago, 15 messages
#1 Julien Rouhaud
julien.rouhaud@dalibo.com
1 attachment(s)

Hello,

The attached patch implements the following TODO item:

Track number of WAL files ready to be archived in pg_stat_archiver

However, it tracks the total number of files of any kind that are ready
to be archived, not only WAL files.

Please let me know what you think about it.

Regards.
--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachments:

pg_stat_archiver_ready_count-v1.patch (text/x-patch)
*** a/doc/src/sgml/monitoring.sgml
--- b/doc/src/sgml/monitoring.sgml
***************
*** 728,733 **** postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
--- 728,738 ----
        <entry>Time of the last failed archival operation</entry>
       </row>
       <row>
+       <entry><structfield>ready_count</></entry>
+       <entry><type>bigint</type></entry>
+       <entry>Number of files waiting to be archived</entry>
+      </row>
+      <row>
        <entry><structfield>stats_reset</></entry>
        <entry><type>timestamp with time zone</type></entry>
        <entry>Time at which these statistics were last reset</entry>
*** a/src/backend/access/transam/xlogarchive.c
--- b/src/backend/access/transam/xlogarchive.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "access/xlog_internal.h"
  #include "miscadmin.h"
  #include "postmaster/startup.h"
+ #include "pgstat.h"
  #include "replication/walsender.h"
  #include "storage/fd.h"
  #include "storage/ipc.h"
***************
*** 539,544 **** XLogArchiveNotify(const char *xlog)
--- 540,548 ----
  	/* Notify archiver that it's got something to do */
  	if (IsUnderPostmaster)
  		SendPostmasterSignal(PMSIGNAL_WAKEN_ARCHIVER);
+ 
+ 	/* Tell the collector about a new file waiting to be archived */
+ 	pgstat_send_archiver(xlog, ARCH_READY);
  }
  
  /*
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 697,702 **** CREATE VIEW pg_stat_archiver AS
--- 697,703 ----
          s.failed_count,
          s.last_failed_wal,
          s.last_failed_time,
+         s.ready_count,
          s.stats_reset
      FROM pg_stat_get_archiver() s;
  
*** a/src/backend/postmaster/pgarch.c
--- b/src/backend/postmaster/pgarch.c
***************
*** 491,497 **** pgarch_ArchiverCopyLoop(void)
  				 * Tell the collector about the WAL file that we successfully
  				 * archived
  				 */
! 				pgstat_send_archiver(xlog, false);
  
  				break;			/* out of inner retry loop */
  			}
--- 491,497 ----
  				 * Tell the collector about the WAL file that we successfully
  				 * archived
  				 */
! 				pgstat_send_archiver(xlog, ARCH_SUCCESS);
  
  				break;			/* out of inner retry loop */
  			}
***************
*** 501,507 **** pgarch_ArchiverCopyLoop(void)
  				 * Tell the collector about the WAL file that we failed to
  				 * archive
  				 */
! 				pgstat_send_archiver(xlog, true);
  
  				if (++failures >= NUM_ARCHIVE_RETRIES)
  				{
--- 501,507 ----
  				 * Tell the collector about the WAL file that we failed to
  				 * archive
  				 */
! 				pgstat_send_archiver(xlog, ARCH_FAIL);
  
  				if (++failures >= NUM_ARCHIVE_RETRIES)
  				{
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
***************
*** 36,41 ****
--- 36,42 ----
  #include "access/transam.h"
  #include "access/twophase_rmgr.h"
  #include "access/xact.h"
+ #include "access/xlog_internal.h"
  #include "catalog/pg_database.h"
  #include "catalog/pg_proc.h"
  #include "lib/ilist.h"
***************
*** 3084,3094 **** pgstat_send(void *msg, int len)
   * pgstat_send_archiver() -
   *
   *	Tell the collector about the WAL file that we successfully
!  *	archived or failed to archive.
   * ----------
   */
  void
! pgstat_send_archiver(const char *xlog, bool failed)
  {
  	PgStat_MsgArchiver msg;
  
--- 3085,3096 ----
   * pgstat_send_archiver() -
   *
   *	Tell the collector about the WAL file that we successfully
!  *	archived or failed to archive, or the new file waiting
!  *	to be archived.
   * ----------
   */
  void
! pgstat_send_archiver(const char *xlog, ArchiverReason reason)
  {
  	PgStat_MsgArchiver msg;
  
***************
*** 3096,3102 **** pgstat_send_archiver(const char *xlog, bool failed)
  	 * Prepare and send the message
  	 */
  	pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_ARCHIVER);
! 	msg.m_failed = failed;
  	strncpy(msg.m_xlog, xlog, sizeof(msg.m_xlog));
  	msg.m_timestamp = GetCurrentTimestamp();
  	pgstat_send(&msg, sizeof(msg));
--- 3098,3104 ----
  	 * Prepare and send the message
  	 */
  	pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_ARCHIVER);
! 	msg.m_reason = reason;
  	strncpy(msg.m_xlog, xlog, sizeof(msg.m_xlog));
  	msg.m_timestamp = GetCurrentTimestamp();
  	pgstat_send(&msg, sizeof(msg));
***************
*** 3921,3927 **** pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
  	/*
  	 * Try to open the stats file. If it doesn't exist, the backends simply
  	 * return zero for anything and the collector simply starts from scratch
! 	 * with empty counters.
  	 *
  	 * ENOENT is a possibility if the stats collector is not running or has
  	 * not yet written the stats file the first time.  Any other failure
--- 3923,3930 ----
  	/*
  	 * Try to open the stats file. If it doesn't exist, the backends simply
  	 * return zero for anything and the collector simply starts from scratch
! 	 * with empty counters, except for the .ready files count which should
! 	 * always give the real number of files.
  	 *
  	 * ENOENT is a possibility if the stats collector is not running or has
  	 * not yet written the stats file the first time.  Any other failure
***************
*** 3934,3939 **** pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
--- 3937,3970 ----
  					(errcode_for_file_access(),
  					 errmsg("could not open statistics file \"%s\": %m",
  							statfile)));
+ 
+ 		/* Initialize the archive ready counter */
+ 		char		XLogArchiveStatusDir[MAXPGPATH];
+ 		DIR		   *rldir;
+ 		struct dirent *rlde;
+ 
+ 		snprintf(XLogArchiveStatusDir, MAXPGPATH, XLOGDIR "/archive_status");
+ 		rldir = AllocateDir(XLogArchiveStatusDir);
+ 		if (rldir == NULL)
+ 			ereport(ERROR,
+ 					(errcode_for_file_access(),
+ 					 errmsg("could not open archive status directory \"%s\": %m",
+ 							XLogArchiveStatusDir)));
+ 
+ 		while ((rlde = ReadDir(rldir, XLogArchiveStatusDir)) != NULL)
+ 		{
+ 			int		basenamelen = (int) strlen(rlde->d_name) - 6;
+ 
+ 			if (basenamelen >= MIN_XFN_CHARS &&
+ 				basenamelen <= MAX_XFN_CHARS &&
+ 				strspn(rlde->d_name, VALID_XFN_CHARS) >= basenamelen &&
+ 				strcmp(rlde->d_name + basenamelen, ".ready") == 0)
+ 			{
+ 				++archiverStats.ready_count;
+ 			}
+ 		}
+ 		FreeDir(rldir);
+ 
  		return dbhash;
  	}
  
***************
*** 4842,4849 **** pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, int len)
  	}
  	else if (msg->m_resettarget == RESET_ARCHIVER)
  	{
! 		/* Reset the archiver statistics for the cluster. */
  		memset(&archiverStats, 0, sizeof(archiverStats));
  		archiverStats.stat_reset_timestamp = GetCurrentTimestamp();
  	}
  
--- 4873,4887 ----
  	}
  	else if (msg->m_resettarget == RESET_ARCHIVER)
  	{
! 		PgStat_Counter ready_count;
! 		/*
! 		 * Reset the archiver statistics for the cluster.
! 		 * We must keep the ready_count value as it should
! 		 * always reflect the real count.
! 		*/
! 		ready_count = archiverStats.ready_count;
  		memset(&archiverStats, 0, sizeof(archiverStats));
+ 		archiverStats.ready_count = ready_count;
  		archiverStats.stat_reset_timestamp = GetCurrentTimestamp();
  	}
  
***************
*** 4984,5004 **** pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len)
  static void
  pgstat_recv_archiver(PgStat_MsgArchiver *msg, int len)
  {
! 	if (msg->m_failed)
! 	{
! 		/* Failed archival attempt */
! 		++archiverStats.failed_count;
! 		memcpy(archiverStats.last_failed_wal, msg->m_xlog,
! 			   sizeof(archiverStats.last_failed_wal));
! 		archiverStats.last_failed_timestamp = msg->m_timestamp;
! 	}
! 	else
  	{
! 		/* Successful archival operation */
! 		++archiverStats.archived_count;
! 		memcpy(archiverStats.last_archived_wal, msg->m_xlog,
! 			   sizeof(archiverStats.last_archived_wal));
! 		archiverStats.last_archived_timestamp = msg->m_timestamp;
  	}
  }
  
--- 5022,5048 ----
  static void
  pgstat_recv_archiver(PgStat_MsgArchiver *msg, int len)
  {
! 	switch (msg->m_reason)
  	{
! 		case ARCH_FAIL:
! 			/* Failed archival attempt */
! 			++archiverStats.failed_count;
! 			memcpy(archiverStats.last_failed_wal, msg->m_xlog,
! 				   sizeof(archiverStats.last_failed_wal));
! 			archiverStats.last_failed_timestamp = msg->m_timestamp;
! 			break;
! 		case ARCH_SUCCESS:
! 			/* Successful archival operation */
! 			++archiverStats.archived_count;
! 			memcpy(archiverStats.last_archived_wal, msg->m_xlog,
! 				   sizeof(archiverStats.last_archived_wal));
! 			archiverStats.last_archived_timestamp = msg->m_timestamp;
! 			--archiverStats.ready_count;
! 		break;
! 		case ARCH_READY:
! 			/* New file waiting to be archived */
! 			++archiverStats.ready_count;
! 			break;
  	}
  }
  
*** a/src/backend/utils/adt/pgstatfuncs.c
--- b/src/backend/utils/adt/pgstatfuncs.c
***************
*** 1746,1752 **** pg_stat_get_archiver(PG_FUNCTION_ARGS)
  	MemSet(nulls, 0, sizeof(nulls));
  
  	/* Initialise attributes information in the tuple descriptor */
! 	tupdesc = CreateTemplateTupleDesc(7, false);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "archived_count",
  					   INT8OID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "last_archived_wal",
--- 1746,1752 ----
  	MemSet(nulls, 0, sizeof(nulls));
  
  	/* Initialise attributes information in the tuple descriptor */
! 	tupdesc = CreateTemplateTupleDesc(8, false);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "archived_count",
  					   INT8OID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "last_archived_wal",
***************
*** 1759,1765 **** pg_stat_get_archiver(PG_FUNCTION_ARGS)
  					   TEXTOID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_failed_time",
  					   TIMESTAMPTZOID, -1, 0);
! 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
  					   TIMESTAMPTZOID, -1, 0);
  
  	BlessTupleDesc(tupdesc);
--- 1759,1767 ----
  					   TEXTOID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_failed_time",
  					   TIMESTAMPTZOID, -1, 0);
! 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "ready_count",
! 					   INT8OID, -1, 0);
! 	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "stats_reset",
  					   TIMESTAMPTZOID, -1, 0);
  
  	BlessTupleDesc(tupdesc);
***************
*** 1790,1799 **** pg_stat_get_archiver(PG_FUNCTION_ARGS)
  	else
  		values[5] = TimestampTzGetDatum(archiver_stats->last_failed_timestamp);
  
  	if (archiver_stats->stat_reset_timestamp == 0)
! 		nulls[6] = true;
  	else
! 		values[6] = TimestampTzGetDatum(archiver_stats->stat_reset_timestamp);
  
  	/* Returns the record as Datum */
  	PG_RETURN_DATUM(HeapTupleGetDatum(
--- 1792,1802 ----
  	else
  		values[5] = TimestampTzGetDatum(archiver_stats->last_failed_timestamp);
  
+ 	values[6] = Int64GetDatum(archiver_stats->ready_count);
  	if (archiver_stats->stat_reset_timestamp == 0)
! 		nulls[7] = true;
  	else
! 		values[7] = TimestampTzGetDatum(archiver_stats->stat_reset_timestamp);
  
  	/* Returns the record as Datum */
  	PG_RETURN_DATUM(HeapTupleGetDatum(
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 2751,2757 **** DATA(insert OID = 2844 (  pg_stat_get_db_blk_read_time	PGNSP PGUID 12 1 0 0 0 f
  DESCR("statistics: block read time, in msec");
  DATA(insert OID = 2845 (  pg_stat_get_db_blk_write_time PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 701 "26" _null_ _null_ _null_ _null_ pg_stat_get_db_blk_write_time _null_ _null_ _null_ ));
  DESCR("statistics: block write time, in msec");
! DATA(insert OID = 3195 (  pg_stat_get_archiver		PGNSP PGUID 12 1 0 0 0 f f f f f f s 0 0 2249 "" "{20,25,1184,20,25,1184,1184}" "{o,o,o,o,o,o,o}" "{archived_count,last_archived_wal,last_archived_time,failed_count,last_failed_wal,last_failed_time,stats_reset}" _null_ pg_stat_get_archiver _null_ _null_ _null_ ));
  DESCR("statistics: information about WAL archiver");
  DATA(insert OID = 2769 ( pg_stat_get_bgwriter_timed_checkpoints PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 20 "" _null_ _null_ _null_ _null_ pg_stat_get_bgwriter_timed_checkpoints _null_ _null_ _null_ ));
  DESCR("statistics: number of timed checkpoints started by the bgwriter");
--- 2751,2757 ----
  DESCR("statistics: block read time, in msec");
  DATA(insert OID = 2845 (  pg_stat_get_db_blk_write_time PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 701 "26" _null_ _null_ _null_ _null_ pg_stat_get_db_blk_write_time _null_ _null_ _null_ ));
  DESCR("statistics: block write time, in msec");
! DATA(insert OID = 3195 (  pg_stat_get_archiver		PGNSP PGUID 12 1 0 0 0 f f f f f f s 0 0 2249 "" "{20,25,1184,20,25,1184,20,1184}" "{o,o,o,o,o,o,o,o}" "{archived_count,last_archived_wal,last_archived_time,failed_count,last_failed_wal,last_failed_time,ready_count,stats_reset}" _null_ pg_stat_get_archiver _null_ _null_ _null_ ));
  DESCR("statistics: information about WAL archiver");
  DATA(insert OID = 2769 ( pg_stat_get_bgwriter_timed_checkpoints PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 20 "" _null_ _null_ _null_ _null_ pg_stat_get_bgwriter_timed_checkpoints _null_ _null_ _null_ ));
  DESCR("statistics: number of timed checkpoints started by the bgwriter");
*** a/src/include/pgstat.h
--- b/src/include/pgstat.h
***************
*** 376,382 **** typedef struct PgStat_MsgAnalyze
  typedef struct PgStat_MsgArchiver
  {
  	PgStat_MsgHdr m_hdr;
! 	bool		m_failed;		/* Failed attempt */
  	char		m_xlog[MAX_XFN_CHARS + 1];
  	TimestampTz m_timestamp;
  } PgStat_MsgArchiver;
--- 376,382 ----
  typedef struct PgStat_MsgArchiver
  {
  	PgStat_MsgHdr m_hdr;
! 	int			m_reason;
  	char		m_xlog[MAX_XFN_CHARS + 1];
  	TimestampTz m_timestamp;
  } PgStat_MsgArchiver;
***************
*** 651,656 **** typedef struct PgStat_ArchiverStats
--- 651,657 ----
  	char		last_failed_wal[MAX_XFN_CHARS + 1];		/* WAL file involved in
  														 * last failure */
  	TimestampTz last_failed_timestamp;	/* last archival failure time */
+ 	PgStat_Counter ready_count;		/* Number of files waiting to be archived */
  	TimestampTz stat_reset_timestamp;
  } PgStat_ArchiverStats;
  
***************
*** 690,695 **** typedef enum BackendState
--- 691,707 ----
  } BackendState;
  
  /* ----------
+  * Archiver reason
+  * ----------
+  */
+ typedef enum ArchiverReason
+ {
+ 	ARCH_SUCCESS,
+ 	ARCH_FAIL,
+ 	ARCH_READY,
+ } ArchiverReason;
+ 
+ /* ----------
   * Shared-memory data structures
   * ----------
   */
***************
*** 934,940 **** extern void pgstat_twophase_postcommit(TransactionId xid, uint16 info,
  extern void pgstat_twophase_postabort(TransactionId xid, uint16 info,
  						  void *recdata, uint32 len);
  
! extern void pgstat_send_archiver(const char *xlog, bool failed);
  extern void pgstat_send_bgwriter(void);
  
  /* ----------
--- 946,952 ----
  extern void pgstat_twophase_postabort(TransactionId xid, uint16 info,
  						  void *recdata, uint32 len);
  
! extern void pgstat_send_archiver(const char *xlog, ArchiverReason reason);
  extern void pgstat_send_bgwriter(void);
  
  /* ----------
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
***************
*** 1659,1666 **** pg_stat_archiver| SELECT s.archived_count,
      s.failed_count,
      s.last_failed_wal,
      s.last_failed_time,
      s.stats_reset
!    FROM pg_stat_get_archiver() s(archived_count, last_archived_wal, last_archived_time, failed_count, last_failed_wal, last_failed_time, stats_reset);
  pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
      pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
      pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
--- 1659,1667 ----
      s.failed_count,
      s.last_failed_wal,
      s.last_failed_time,
+     s.ready_count,
      s.stats_reset
!    FROM pg_stat_get_archiver() s(archived_count, last_archived_wal, last_archived_time, failed_count, last_failed_wal, last_failed_time, ready_count, stats_reset);
  pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
      pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
      pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
#2 Gilles Darold
gilles.darold@dalibo.com
In reply to: Julien Rouhaud (#1)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 21/08/2014 10:17, Julien Rouhaud wrote:

> Hello,
>
> The attached patch implements the following TODO item:
>
> Track number of WAL files ready to be archived in pg_stat_archiver
>
> However, it tracks the total number of files of any kind that are ready
> to be archived, not only WAL files.
>
> Please let me know what you think about it.
>
> Regards.

Hi,

Looking at the archive ready count may be more efficient if it is done
in the view definition through a function. This would avoid any issue
with incrementing/decrementing archiverStats.ready_count, and the patch
would be simpler. Also, I don't think we need another memory allocation
for that: the counter information is always available as the number of
.ready files in the archive_status directory, and a call to
pg_stat_archiver doesn't need to be especially fast.

For example, a new function called pg_stat_get_archive_ready_count()
could do the same as what you added to pgstat_read_statsfiles(), with
pg_stat_archiver defined as follows:

CREATE VIEW pg_stat_archiver AS
    SELECT
        s.archived_count,
        s.last_archived_wal,
        s.last_archived_time,
        s.failed_count,
        s.last_failed_wal,
        s.last_failed_time,
        pg_stat_get_archive_ready_count() AS ready_count,
        s.stats_reset
    FROM pg_stat_get_archiver() s;

The function pg_stat_get_archive_ready_count() would also be available
for any other query.
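
A stand-alone sketch of what such a function would compute is shown here. It is not backend code: it assumes plain POSIX opendir()/readdir() instead of AllocateDir()/ReadDir(), the three file name constants are assumed values (the backend takes them from its own headers), and is_ready_file() and count_ready_files() are hypothetical names used only for this illustration.

```c
#include <dirent.h>
#include <string.h>

/* Assumed values; the backend takes these from its own headers. */
#define MIN_XFN_CHARS	16
#define MAX_XFN_CHARS	40
#define VALID_XFN_CHARS "0123456789ABCDEF.history.backup"
#define READY_SUFFIX	".ready"

/* Return 1 if name looks like "<archivable file name>.ready", else 0. */
static int
is_ready_file(const char *name)
{
	size_t		len = strlen(name);
	size_t		suffixlen = strlen(READY_SUFFIX);
	size_t		basenamelen;

	/* Too short to even hold the suffix. */
	if (len < suffixlen)
		return 0;
	basenamelen = len - suffixlen;

	return basenamelen >= MIN_XFN_CHARS &&
		basenamelen <= MAX_XFN_CHARS &&
		strspn(name, VALID_XFN_CHARS) >= basenamelen &&
		strcmp(name + basenamelen, READY_SUFFIX) == 0;
}

/* Hypothetical helper: count .ready files in dir, or return -1 on error. */
static long
count_ready_files(const char *dir)
{
	DIR		   *d = opendir(dir);
	struct dirent *de;
	long		count = 0;

	if (d == NULL)
		return -1;
	while ((de = readdir(d)) != NULL)
	{
		if (is_ready_file(de->d_name))
			count++;
	}
	closedir(d);
	return count;
}
```

Because the directory is scanned on every call, the result always reflects what is on disk, at the cost of one directory read per query of the view.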

--
Gilles Darold
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Julien Rouhaud
julien.rouhaud@dalibo.com
In reply to: Gilles Darold (#2)
1 attachment(s)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 25/08/2014 19:00, Gilles Darold wrote:

> On 21/08/2014 10:17, Julien Rouhaud wrote:
>
>> Hello,
>>
>> The attached patch implements the following TODO item:
>>
>> Track number of WAL files ready to be archived in pg_stat_archiver
>>
>> However, it tracks the total number of files of any kind that are ready
>> to be archived, not only WAL files.
>>
>> Please let me know what you think about it.
>>
>> Regards.
>
> Hi,
>
> Looking at the archive ready count may be more efficient if it is done
> in the view definition through a function. This would avoid any issue
> with incrementing/decrementing archiverStats.ready_count, and the patch
> would be simpler. Also, I don't think we need another memory allocation
> for that: the counter information is always available as the number of
> .ready files in the archive_status directory, and a call to
> pg_stat_archiver doesn't need to be especially fast.
>
> For example, a new function called pg_stat_get_archive_ready_count()
> could do the same as what you added to pgstat_read_statsfiles(), with
> pg_stat_archiver defined as follows:
>
> CREATE VIEW pg_stat_archiver AS
>     SELECT
>         s.archived_count,
>         s.last_archived_wal,
>         s.last_archived_time,
>         s.failed_count,
>         s.last_failed_wal,
>         s.last_failed_time,
>         pg_stat_get_archive_ready_count() AS ready_count,
>         s.stats_reset
>     FROM pg_stat_get_archiver() s;
>
> The function pg_stat_get_archive_ready_count() would also be available
> for any other query.

Indeed, this approach should be more efficient. It also avoids unexpected
results, for example if someone has the bad idea of removing a .ready file
from the pg_xlog/archive_status directory.
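
A toy model of that drift follows. It is not backend code: ToyArchiveState, toy_event() and toy_manual_remove() are made-up names for this illustration, and the "directory" is just a second counter. It contrasts a counter maintained by events, as in v1, with the number of .ready files actually present.

```c
/* Same three events as the v1 patch's ArchiverReason enum. */
typedef enum
{
	ARCH_SUCCESS,
	ARCH_FAIL,
	ARCH_READY
} ArchiverReason;

/* Toy state: what the collector believes vs. what is on disk. */
typedef struct
{
	long		ready_count;	/* event-driven counter (v1 approach) */
	long		files_on_disk;	/* .ready files really present */
} ToyArchiveState;

/* Apply one archiver event to both the counter and the "directory". */
static void
toy_event(ToyArchiveState *st, ArchiverReason reason)
{
	switch (reason)
	{
		case ARCH_READY:
			st->ready_count++;
			st->files_on_disk++;
			break;
		case ARCH_SUCCESS:
			st->ready_count--;
			st->files_on_disk--;
			break;
		case ARCH_FAIL:
			/* The file stays in place, so neither value changes. */
			break;
	}
}

/* Someone removes a .ready file by hand: no message reaches the collector. */
static void
toy_manual_remove(ToyArchiveState *st)
{
	st->files_on_disk--;
}
```

After two ARCH_READY events and one manual removal, ready_count is still 2 while only one file remains; a scan of the directory would report 1.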

The attached v2 patch implements this approach. All the work is still done
in pg_stat_get_archiver, as I don't think that having a specific
function for that information would be really useful.
--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachments:

pg_stat_archiver_ready_count-v2.patch (text/x-patch)
*** a/doc/src/sgml/monitoring.sgml
--- b/doc/src/sgml/monitoring.sgml
***************
*** 728,733 **** postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
--- 728,738 ----
        <entry>Time of the last failed archival operation</entry>
       </row>
       <row>
+       <entry><structfield>ready_count</></entry>
+       <entry><type>bigint</type></entry>
+       <entry>Number of files waiting to be archived</entry>
+      </row>
+      <row>
        <entry><structfield>stats_reset</></entry>
        <entry><type>timestamp with time zone</type></entry>
        <entry>Time at which these statistics were last reset</entry>
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 697,702 **** CREATE VIEW pg_stat_archiver AS
--- 697,703 ----
          s.failed_count,
          s.last_failed_wal,
          s.last_failed_time,
+         s.ready_count,
          s.stats_reset
      FROM pg_stat_get_archiver() s;
  
*** a/src/backend/utils/adt/pgstatfuncs.c
--- b/src/backend/utils/adt/pgstatfuncs.c
***************
*** 15,25 ****
--- 15,27 ----
  #include "postgres.h"
  
  #include "access/htup_details.h"
+ #include "access/xlog_internal.h"
  #include "catalog/pg_type.h"
  #include "funcapi.h"
  #include "libpq/ip.h"
  #include "miscadmin.h"
  #include "pgstat.h"
+ #include "storage/fd.h"
  #include "utils/builtins.h"
  #include "utils/inet.h"
  #include "utils/timestamp.h"
***************
*** 1737,1752 **** Datum
  pg_stat_get_archiver(PG_FUNCTION_ARGS)
  {
  	TupleDesc	tupdesc;
! 	Datum		values[7];
! 	bool		nulls[7];
  	PgStat_ArchiverStats *archiver_stats;
  
  	/* Initialise values and NULL flags arrays */
  	MemSet(values, 0, sizeof(values));
  	MemSet(nulls, 0, sizeof(nulls));
  
  	/* Initialise attributes information in the tuple descriptor */
! 	tupdesc = CreateTemplateTupleDesc(7, false);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "archived_count",
  					   INT8OID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "last_archived_wal",
--- 1739,1758 ----
  pg_stat_get_archiver(PG_FUNCTION_ARGS)
  {
  	TupleDesc	tupdesc;
! 	Datum		values[8];
! 	bool		nulls[8];
  	PgStat_ArchiverStats *archiver_stats;
+ 	char		XLogArchiveStatusDir[MAXPGPATH];
+ 	DIR		   *rldir;
+ 	struct		dirent *rlde;
+ 	int			ready_count = 0;
  
  	/* Initialise values and NULL flags arrays */
  	MemSet(values, 0, sizeof(values));
  	MemSet(nulls, 0, sizeof(nulls));
  
  	/* Initialise attributes information in the tuple descriptor */
! 	tupdesc = CreateTemplateTupleDesc(8, false);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "archived_count",
  					   INT8OID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "last_archived_wal",
***************
*** 1759,1765 **** pg_stat_get_archiver(PG_FUNCTION_ARGS)
  					   TEXTOID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_failed_time",
  					   TIMESTAMPTZOID, -1, 0);
! 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
  					   TIMESTAMPTZOID, -1, 0);
  
  	BlessTupleDesc(tupdesc);
--- 1765,1773 ----
  					   TEXTOID, -1, 0);
  	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_failed_time",
  					   TIMESTAMPTZOID, -1, 0);
! 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "ready_count",
! 					   INT8OID, -1, 0);
! 	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "stats_reset",
  					   TIMESTAMPTZOID, -1, 0);
  
  	BlessTupleDesc(tupdesc);
***************
*** 1790,1799 **** pg_stat_get_archiver(PG_FUNCTION_ARGS)
  	else
  		values[5] = TimestampTzGetDatum(archiver_stats->last_failed_timestamp);
  
  	if (archiver_stats->stat_reset_timestamp == 0)
! 		nulls[6] = true;
  	else
! 		values[6] = TimestampTzGetDatum(archiver_stats->stat_reset_timestamp);
  
  	/* Returns the record as Datum */
  	PG_RETURN_DATUM(HeapTupleGetDatum(
--- 1798,1830 ----
  	else
  		values[5] = TimestampTzGetDatum(archiver_stats->last_failed_timestamp);
  
+ 	snprintf(XLogArchiveStatusDir, MAXPGPATH, XLOGDIR "/archive_status");
+ 	rldir = AllocateDir(XLogArchiveStatusDir);
+ 	if (rldir == NULL)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open archive status directory \"%s\": %m",
+ 						XLogArchiveStatusDir)));
+ 
+ 	while ((rlde = ReadDir(rldir, XLogArchiveStatusDir)) != NULL)
+ 	{
+ 		int			basenamelen = (int) strlen(rlde->d_name) - 6;
+ 
+ 		if (basenamelen >= MIN_XFN_CHARS &&
+ 			basenamelen <= MAX_XFN_CHARS &&
+ 			strspn(rlde->d_name, VALID_XFN_CHARS) >= basenamelen &&
+ 			strcmp(rlde->d_name + basenamelen, ".ready") == 0)
+ 		{
+ 			++ready_count;
+ 		}
+ 	}
+ 	FreeDir(rldir);
+ 	values[6] = Int64GetDatum(ready_count);
+ 
  	if (archiver_stats->stat_reset_timestamp == 0)
! 		nulls[7] = true;
  	else
! 		values[7] = TimestampTzGetDatum(archiver_stats->stat_reset_timestamp);
  
  	/* Returns the record as Datum */
  	PG_RETURN_DATUM(HeapTupleGetDatum(
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 2751,2757 **** DATA(insert OID = 2844 (  pg_stat_get_db_blk_read_time	PGNSP PGUID 12 1 0 0 0 f
  DESCR("statistics: block read time, in msec");
  DATA(insert OID = 2845 (  pg_stat_get_db_blk_write_time PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 701 "26" _null_ _null_ _null_ _null_ pg_stat_get_db_blk_write_time _null_ _null_ _null_ ));
  DESCR("statistics: block write time, in msec");
! DATA(insert OID = 3195 (  pg_stat_get_archiver		PGNSP PGUID 12 1 0 0 0 f f f f f f s 0 0 2249 "" "{20,25,1184,20,25,1184,1184}" "{o,o,o,o,o,o,o}" "{archived_count,last_archived_wal,last_archived_time,failed_count,last_failed_wal,last_failed_time,stats_reset}" _null_ pg_stat_get_archiver _null_ _null_ _null_ ));
  DESCR("statistics: information about WAL archiver");
  DATA(insert OID = 2769 ( pg_stat_get_bgwriter_timed_checkpoints PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 20 "" _null_ _null_ _null_ _null_ pg_stat_get_bgwriter_timed_checkpoints _null_ _null_ _null_ ));
  DESCR("statistics: number of timed checkpoints started by the bgwriter");
--- 2751,2757 ----
  DESCR("statistics: block read time, in msec");
  DATA(insert OID = 2845 (  pg_stat_get_db_blk_write_time PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 701 "26" _null_ _null_ _null_ _null_ pg_stat_get_db_blk_write_time _null_ _null_ _null_ ));
  DESCR("statistics: block write time, in msec");
! DATA(insert OID = 3195 (  pg_stat_get_archiver		PGNSP PGUID 12 1 0 0 0 f f f f f f s 0 0 2249 "" "{20,25,1184,20,25,1184,20,1184}" "{o,o,o,o,o,o,o,o}" "{archived_count,last_archived_wal,last_archived_time,failed_count,last_failed_wal,last_failed_time,ready_count,stats_reset}" _null_ pg_stat_get_archiver _null_ _null_ _null_ ));
  DESCR("statistics: information about WAL archiver");
  DATA(insert OID = 2769 ( pg_stat_get_bgwriter_timed_checkpoints PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 20 "" _null_ _null_ _null_ _null_ pg_stat_get_bgwriter_timed_checkpoints _null_ _null_ _null_ ));
  DESCR("statistics: number of timed checkpoints started by the bgwriter");
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
***************
*** 1659,1666 **** pg_stat_archiver| SELECT s.archived_count,
      s.failed_count,
      s.last_failed_wal,
      s.last_failed_time,
      s.stats_reset
!    FROM pg_stat_get_archiver() s(archived_count, last_archived_wal, last_archived_time, failed_count, last_failed_wal, last_failed_time, stats_reset);
  pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
      pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
      pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
--- 1659,1667 ----
      s.failed_count,
      s.last_failed_wal,
      s.last_failed_time,
+     s.ready_count,
      s.stats_reset
!    FROM pg_stat_get_archiver() s(archived_count, last_archived_wal, last_archived_time, failed_count, last_failed_wal, last_failed_time, ready_count, stats_reset);
  pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
      pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
      pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
#4 Michael Paquier
michael.paquier@gmail.com
In reply to: Julien Rouhaud (#3)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Thu, Aug 28, 2014 at 7:37 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

> Attached v2 patch implements this approach. All the work is still done
> in pg_stat_get_archiver, as I don't think that having a specific
> function for that information would be really interesting.

Please be sure to add that to the next commit fest. This is a feature
most welcome within this system view.
Regards,
--
Michael


#5 Julien Rouhaud
julien.rouhaud@dalibo.com
In reply to: Michael Paquier (#4)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 28/08/2014 05:58, Michael Paquier wrote:

> On Thu, Aug 28, 2014 at 7:37 AM, Julien Rouhaud
> <julien.rouhaud@dalibo.com> wrote:
>
>> Attached v2 patch implements this approach. All the work is still
>> done in pg_stat_get_archiver, as I don't think that having a
>> specific function for that information would be really
>> interesting.
>
> Please be sure to add that to the next commit fest. This is a
> feature most welcome within this system view. Regards,

I just added it.

Thanks.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

#6 Brightwell, Adam
adam.brightwell@crunchydatasolutions.com
In reply to: Julien Rouhaud (#5)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

Julien,

The following is an initial review:

* Applies cleanly to master (f330a6d).
* Regression tests updated and pass, including 'check-world'.
* Documentation updated and builds successfully.
* Might want to consider replacing the following magic number with a
constant or perhaps a calculated value.

+ int basenamelen = (int) strlen(rlde->d_name) - 6;

* Wouldn't it be easier, or perhaps more reliable, to use "strrchr()"
instead of the following?

+ strcmp(rlde->d_name + basenamelen, ".ready") == 0)

For example:

char *extension = strrchr(rlde->d_name, '.');
...
strcmp(extension, ".ready") == 0)

I think this approach might also help to resolve the magic number above.
For example:

char *extension = strrchr(rlde->d_name, '.');
int basenamelen = (int) strlen(rlde->d_name) - strlen(extension);
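
A slightly fuller sketch of that idea, written as a stand-alone helper (ready_basename_len() is a hypothetical name, not something in the tree). One detail worth keeping in mind: strrchr() returns NULL when the name contains no '.', so its result has to be checked before it is handed to strcmp() or strlen().

```c
#include <string.h>

/*
 * Hypothetical helper: return the length of the name without its ".ready"
 * extension, or -1 if the name does not end in ".ready".
 */
static int
ready_basename_len(const char *name)
{
	const char *extension = strrchr(name, '.');

	/* No '.' at all: strrchr() returned NULL, so there is no extension. */
	if (extension == NULL)
		return -1;
	if (strcmp(extension, ".ready") != 0)
		return -1;

	/* The pointer difference is the base name length; no magic 6 needed. */
	return (int) (extension - name);
}
```

The caller would still need to apply the MIN_XFN_CHARS/MAX_XFN_CHARS and VALID_XFN_CHARS checks to the returned length.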

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

#7 Julien Rouhaud
rjuju123@gmail.com
In reply to: Brightwell, Adam (#6)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Tue, Oct 21, 2014 at 7:35 AM, Brightwell, Adam <
adam.brightwell@crunchydatasolutions.com> wrote:

Julien,

The following is an initial review:

Thanks for the review.

* Applies cleanly to master (f330a6d).
* Regression tests updated and pass, including 'check-world'.
* Documentation updated and builds successfully.
* Might want to consider replacing the following magic number with a
constant or perhaps calculated value.

+ int basenamelen = (int) strlen(rlde->d_name) - 6;

* Wouldn't it be easier, or perhaps more reliable to use "strrchr()" with
the following instead?

+ strcmp(rlde->d_name + basenamelen, ".ready") == 0)

char *extension = strrchr(rlde->d_name, '.');
...
strcmp(extension, ".ready") == 0)

I think this approach might also help to resolve the magic number above.
For example:

char *extension = strrchr(rlde->d_name, '.');
int basenamelen = (int) strlen(rlde->d_name) - strlen(extension);

Actually, I used the same loop as the archiver one (see
backend/postmaster/pgarch.c, function pgarch_readyXlog) to get the exact
same number of files.

If we change it in this patch, it would be better to change it everywhere.
What do you think?

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

#8Brightwell, Adam
adam.brightwell@crunchydatasolutions.com
In reply to: Julien Rouhaud (#7)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

Julien,

Actually, I used the same loop as the archiver one (see
backend/postmaster/pgarch.c, function pgarch_readyXlog) to get the exact
same number of files.

Ah, I see.

If we change it in this patch, it would be better to change it everywhere.
What do you think?

Hmm... I'd have to defer to the better judgement of a committer on that
one. Though, I would think that the general desire would be to keep the
patch relevant ONLY to the necessary changes. I would not qualify making
those types of changes as relevant, IMHO. I do think there is potential
for cleanup here; however, I suspect that would be best done in a separate
patch. But again, I'd defer to a committer on whether such changes are even
necessary/acceptable.

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Julien Rouhaud (#1)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 21 August 2014 09:17, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

Track number of WAL files ready to be archived in pg_stat_archiver

Would it be OK to ask what the purpose of this TODO item is?

pg_stat_archiver already has a column for last_archived_wal and
last_failed_wal, so you can already work out how many files there must
be between then and now. Perhaps that can be added directly to the
view, to assist the user in calculating it. Reading the directory
itself to count the file is unnecessary, except as a diagnostic.
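As a rough sketch of the calculation described above, and not anything taken from the patch: a WAL file name is 24 hex digits (8 for the timeline, 8 for the "log" id, 8 for the segment), so the distance between last_archived_wal and the segment currently being written can be computed from the names alone. The function names below are hypothetical. The sketch assumes the default 16 MB segment size (256 segments per log id), that both names are on the same timeline, and that the current segment itself is not yet ready for archiving.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default 16 MB segments, i.e. 0x100000000 / 16 MB = 256 */
#define SEGMENTS_PER_XLOGID 256

/* Split a 24-character WAL file name into timeline and segment number. */
bool
parse_wal_name(const char *name, uint32_t *tli, uint64_t *segno)
{
	unsigned int t, log, seg;

	assert(name != NULL);

	/* exactly 24 upper-case hex digits, as produced by the server */
	if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
		return false;
	if (sscanf(name, "%8X%8X%8X", &t, &log, &seg) != 3)
		return false;

	*tli = t;
	*segno = (uint64_t) log * SEGMENTS_PER_XLOGID + seg;
	return true;
}

/*
 * Estimate how many segments lie strictly between the last archived
 * segment and the segment currently being written.  Returns -1 if the
 * names cannot be parsed, the timelines differ, or the order is wrong.
 */
int64_t
estimate_wal_queue(const char *last_archived, const char *current)
{
	uint32_t	tli_a, tli_c;
	uint64_t	seg_a, seg_c;

	if (!parse_wal_name(last_archived, &tli_a, &seg_a) ||
		!parse_wal_name(current, &tli_c, &seg_c))
		return -1;
	if (tli_a != tli_c || seg_c <= seg_a)
		return -1;
	return (int64_t) (seg_c - seg_a - 1);
}
```

With last_archived_wal = 000000010000000000000005 and a current segment of 000000010000000000000009, segments 6, 7 and 8 are in between, so the estimate is 3. The estimate says nothing about files that were copied into pg_xlog by other means.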

Please don't take "it is a TODO item" as "generally accepted that
this makes sense".

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#10Michael Paquier
michael.paquier@gmail.com
In reply to: Simon Riggs (#9)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Tue, Nov 18, 2014 at 5:47 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 21 August 2014 09:17, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

Track number of WAL files ready to be archived in pg_stat_archiver

Would it be OK to ask what the purpose of this TODO item is?

pg_stat_archiver already has a column for last_archived_wal and
last_failed_wal, so you can already work out how many files there must
be between then and now. Perhaps that can be added directly to the
view, to assist the user in calculating it. Reading the directory
itself to count the file is unnecessary, except as a diagnostic.

Not sure if this holds true in a node freshly started from a base
backup with a set of WAL files, or with files manually copied by an
operator.

Please don't take "it is a TODO item" as "generally accepted that
this makes sense".

On systems where WAL archiving is slower than WAL generation at
peak time, the DBA may want to know how long the queue of WAL files
waiting to be archived is. That's IMO something we simply forgot in the
first implementation of pg_stat_archiver, and the most direct way to
know that is to count the .ready files in archive_status.
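For illustration only, counting the .ready files from outside the server could look like the sketch below. It uses plain POSIX opendir()/readdir() rather than the backend's AllocateDir()/ReadDir(), and the function name count_ready_files is made up. The directory to pass in is an assumption about the installation (typically pg_xlog/archive_status under the data directory).

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/*
 * Count directory entries whose name ends in ".ready".
 * Returns -1 if the directory cannot be opened.
 */
long
count_ready_files(const char *status_dir)
{
	DIR		   *dir;
	struct dirent *de;
	long		count = 0;
	const size_t suffixlen = strlen(".ready");

	assert(status_dir != NULL);

	dir = opendir(status_dir);
	if (dir == NULL)
		return -1;

	while ((de = readdir(dir)) != NULL)
	{
		size_t		len = strlen(de->d_name);

		/* need at least one character in front of the suffix */
		if (len > suffixlen &&
			strcmp(de->d_name + len - suffixlen, ".ready") == 0)
			count++;
	}
	closedir(dir);
	return count;
}
```

The cost of such a scan grows with the number of entries in the directory, which matters if it is run on every monitoring query.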
--
Michael


#11Michael Paquier
michael.paquier@gmail.com
In reply to: Brightwell, Adam (#8)
1 attachment(s)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Wed, Oct 22, 2014 at 12:50 AM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:

Though, I would think that the general desire would be to keep the patch
relevant ONLY to the necessary changes. I would not qualify making those
types of changes as relevant, IMHO. I do think this is potential for
cleanup, however, I would suspect that would be best done in a separate
patch. But again, I'd defer to a committer whether such changes are even
necessary/acceptable.

I have been looking at this patch, and I think that it is a mistake to
count the .ready files present in archive_status when calling
pg_stat_get_archiver(). If there are many files waiting to be
archived, this penalizes the run time of this function and of any
application relying on its results, not to mention that the loop used
to count the .ready files is a copy of what is in pgarch.c. Hence I
think that we should simply count them in pgarch_readyXlog and return
that value to pgarch_ArchiverCopyLoop. The value could then be
decremented by 1 each time a file is successfully archived, which keeps
the stats as precise as possible and gives the user useful information
while the archiver process is within a single pass of
pgarch_ArchiverCopyLoop. This way, we just need to extend
PgStat_MsgArchiver with a new counter to track this number.
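To make the idea concrete outside the backend, here is a simplified model of such a scan. It is only a sketch: it works on an in-memory list of names instead of a directory, and scan_ready is a hypothetical function, not pgarch_readyXlog itself. A single pass both picks the oldest .ready entry (smallest name in strcmp() order) and counts how many .ready entries there are.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Scan a list of status file names once.  *oldest is set to the smallest
 * ".ready" name in strcmp() order, or NULL if there is none.  The return
 * value is the number of ".ready" entries seen.
 */
long
scan_ready(const char *const names[], size_t n, const char **oldest)
{
	const size_t suffixlen = strlen(".ready");
	long		ready_count = 0;
	size_t		i;

	assert(oldest != NULL);
	*oldest = NULL;

	for (i = 0; i < n; i++)
	{
		size_t		len = strlen(names[i]);

		/* skip anything that is not "<something>.ready" */
		if (len <= suffixlen ||
			strcmp(names[i] + len - suffixlen, ".ready") != 0)
			continue;

		if (*oldest == NULL || strcmp(names[i], *oldest) < 0)
			*oldest = names[i];
		ready_count++;
	}
	return ready_count;
}
```

A caller would archive *oldest, subtract one from the returned count on success, and report that number, so the directory is not read again just to produce the statistic.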

The attached patch, based on v2 sent previously, does so. Thoughts?
--
Michael

Attachments:

pg_stat_archiver_ready_count-v3.patch (text/x-patch; charset=US-ASCII)
From e12a1aff3f1b423da5277cccf2a76ec09318567a Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Tue, 18 Nov 2014 16:30:23 +0900
Subject: [PATCH] Track number of files marked as ready for archiving in
 pg_stat_archiver

This number of files is directly tracked by the archiver process that then
reports the number it finds to the stat machinery. Note that when archiver
marks a file as successfully archived, it decrements by one the number of
files waiting to be archived, giving more precise information to the user.
---
 doc/src/sgml/monitoring.sgml         |  5 +++++
 src/backend/catalog/system_views.sql |  1 +
 src/backend/postmaster/pgarch.c      | 33 +++++++++++++++++++--------------
 src/backend/postmaster/pgstat.c      |  6 +++++-
 src/backend/utils/adt/pgstatfuncs.c  | 21 +++++++++++++++------
 src/include/catalog/pg_proc.h        |  2 +-
 src/include/pgstat.h                 |  5 ++++-
 src/test/regress/expected/rules.out  |  3 ++-
 8 files changed, 52 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b29e5e6..4f4ac73 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -870,6 +870,11 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       <entry>Time of the last failed archival operation</entry>
      </row>
      <row>
+      <entry><structfield>ready_count</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of files waiting to be archived</entry>
+     </row>
+     <row>
       <entry><structfield>stats_reset</></entry>
       <entry><type>timestamp with time zone</type></entry>
       <entry>Time at which these statistics were last reset</entry>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a819952..195769c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -729,6 +729,7 @@ CREATE VIEW pg_stat_archiver AS
         s.failed_count,
         s.last_failed_wal,
         s.last_failed_time,
+        s.ready_count,
         s.stats_reset
     FROM pg_stat_get_archiver() s;
 
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 6a5c5b0..7f5b813 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -100,7 +100,7 @@ static void pgarch_waken_stop(SIGNAL_ARGS);
 static void pgarch_MainLoop(void);
 static void pgarch_ArchiverCopyLoop(void);
 static bool pgarch_archiveXlog(char *xlog);
-static bool pgarch_readyXlog(char *xlog);
+static int64 pgarch_readyXlog(char *xlog);
 static void pgarch_archiveDone(char *xlog);
 
 
@@ -440,6 +440,7 @@ static void
 pgarch_ArchiverCopyLoop(void)
 {
 	char		xlog[MAX_XFN_CHARS + 1];
+	int64		ready_count;
 
 	/*
 	 * loop through all xlogs with archive_status of .ready and archive
@@ -447,7 +448,7 @@ pgarch_ArchiverCopyLoop(void)
 	 * some backend will add files onto the list of those that need archiving
 	 * while we are still copying earlier archives
 	 */
-	while (pgarch_readyXlog(xlog))
+	while ((ready_count = pgarch_readyXlog(xlog)) != 0)
 	{
 		int			failures = 0;
 
@@ -488,11 +489,16 @@ pgarch_ArchiverCopyLoop(void)
 				pgarch_archiveDone(xlog);
 
 				/*
+				 * File has been archived, reducing by one the entries waiting
+				 * to be archived.
+				 */
+				ready_count--;
+
+				/*
 				 * Tell the collector about the WAL file that we successfully
 				 * archived
 				 */
-				pgstat_send_archiver(xlog, false);
-
+				pgstat_send_archiver(xlog, ready_count, false);
 				break;			/* out of inner retry loop */
 			}
 			else
@@ -501,7 +507,7 @@ pgarch_ArchiverCopyLoop(void)
 				 * Tell the collector about the WAL file that we failed to
 				 * archive
 				 */
-				pgstat_send_archiver(xlog, true);
+				pgstat_send_archiver(xlog, ready_count, true);
 
 				if (++failures >= NUM_ARCHIVE_RETRIES)
 				{
@@ -668,7 +674,8 @@ pgarch_archiveXlog(char *xlog)
  * No notification is set that file archiving is now in progress, so
  * this would need to be extended if multiple concurrent archival
  * tasks were created. If a failure occurs, we will completely
- * re-copy the file at the next available opportunity.
+ * re-copy the file at the next available opportunity. This function
+ * returns the number of files counted as in ready state.
  *
  * It is important that we return the oldest, so that we archive xlogs
  * in order that they were written, for two reasons:
@@ -682,7 +689,7 @@ pgarch_archiveXlog(char *xlog)
  * higher priority for archiving.  This seems okay, or at least not
  * obviously worth changing.
  */
-static bool
+static int64
 pgarch_readyXlog(char *xlog)
 {
 	/*
@@ -695,7 +702,7 @@ pgarch_readyXlog(char *xlog)
 	char		newxlog[MAX_XFN_CHARS + 6 + 1];
 	DIR		   *rldir;
 	struct dirent *rlde;
-	bool		found = false;
+	int64		ready_count = 0;
 
 	snprintf(XLogArchiveStatusDir, MAXPGPATH, XLOGDIR "/archive_status");
 	rldir = AllocateDir(XLogArchiveStatusDir);
@@ -714,27 +721,25 @@ pgarch_readyXlog(char *xlog)
 			strspn(rlde->d_name, VALID_XFN_CHARS) >= basenamelen &&
 			strcmp(rlde->d_name + basenamelen, ".ready") == 0)
 		{
-			if (!found)
-			{
+			if (ready_count == 0)
 				strcpy(newxlog, rlde->d_name);
-				found = true;
-			}
 			else
 			{
 				if (strcmp(rlde->d_name, newxlog) < 0)
 					strcpy(newxlog, rlde->d_name);
 			}
+			ready_count++;
 		}
 	}
 	FreeDir(rldir);
 
-	if (found)
+	if (ready_count > 0)
 	{
 		/* truncate off the .ready */
 		newxlog[strlen(newxlog) - 6] = '\0';
 		strcpy(xlog, newxlog);
 	}
-	return found;
+	return ready_count;
 }
 
 /*
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c7f41a5..2e9b276 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3088,10 +3088,12 @@ pgstat_send(void *msg, int len)
  * ----------
  */
 void
-pgstat_send_archiver(const char *xlog, bool failed)
+pgstat_send_archiver(const char *xlog, int64 ready_count, bool failed)
 {
 	PgStat_MsgArchiver msg;
 
+	Assert(ready_count >= 0);
+
 	/*
 	 * Prepare and send the message
 	 */
@@ -3099,6 +3101,7 @@ pgstat_send_archiver(const char *xlog, bool failed)
 	msg.m_failed = failed;
 	strncpy(msg.m_xlog, xlog, sizeof(msg.m_xlog));
 	msg.m_timestamp = GetCurrentTimestamp();
+	msg.m_ready_count = ready_count;
 	pgstat_send(&msg, sizeof(msg));
 }
 
@@ -5000,6 +5003,7 @@ pgstat_recv_archiver(PgStat_MsgArchiver *msg, int len)
 			   sizeof(archiverStats.last_archived_wal));
 		archiverStats.last_archived_timestamp = msg->m_timestamp;
 	}
+	archiverStats.ready_count = msg->m_ready_count;
 }
 
 /* ----------
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index d621a68..37fd7d2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -15,11 +15,13 @@
 #include "postgres.h"
 
 #include "access/htup_details.h"
+#include "access/xlog_internal.h"
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "libpq/ip.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "storage/fd.h"
 #include "utils/builtins.h"
 #include "utils/inet.h"
 #include "utils/timestamp.h"
@@ -1737,8 +1739,8 @@ Datum
 pg_stat_get_archiver(PG_FUNCTION_ARGS)
 {
 	TupleDesc	tupdesc;
-	Datum		values[7];
-	bool		nulls[7];
+	Datum		values[8];
+	bool		nulls[8];
 	PgStat_ArchiverStats *archiver_stats;
 
 	/* Initialise values and NULL flags arrays */
@@ -1746,7 +1748,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	MemSet(nulls, 0, sizeof(nulls));
 
 	/* Initialise attributes information in the tuple descriptor */
-	tupdesc = CreateTemplateTupleDesc(7, false);
+	tupdesc = CreateTemplateTupleDesc(8, false);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "archived_count",
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "last_archived_wal",
@@ -1759,7 +1761,9 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 					   TEXTOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_failed_time",
 					   TIMESTAMPTZOID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
+	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "ready_count",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 8, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
 	BlessTupleDesc(tupdesc);
@@ -1790,10 +1794,15 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	else
 		values[5] = TimestampTzGetDatum(archiver_stats->last_failed_timestamp);
 
-	if (archiver_stats->stat_reset_timestamp == 0)
+	if (archiver_stats->ready_count == 0)
 		nulls[6] = true;
 	else
-		values[6] = TimestampTzGetDatum(archiver_stats->stat_reset_timestamp);
+		values[6] = Int64GetDatum(archiver_stats->ready_count);
+
+	if (archiver_stats->stat_reset_timestamp == 0)
+		nulls[7] = true;
+	else
+		values[7] = TimestampTzGetDatum(archiver_stats->stat_reset_timestamp);
 
 	/* Returns the record as Datum */
 	PG_RETURN_DATUM(HeapTupleGetDatum(
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 5d4e889..28e3b46 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2788,7 +2788,7 @@ DATA(insert OID = 2844 (  pg_stat_get_db_blk_read_time	PGNSP PGUID 12 1 0 0 0 f
 DESCR("statistics: block read time, in msec");
 DATA(insert OID = 2845 (  pg_stat_get_db_blk_write_time PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 701 "26" _null_ _null_ _null_ _null_ pg_stat_get_db_blk_write_time _null_ _null_ _null_ ));
 DESCR("statistics: block write time, in msec");
-DATA(insert OID = 3195 (  pg_stat_get_archiver		PGNSP PGUID 12 1 0 0 0 f f f f f f s 0 0 2249 "" "{20,25,1184,20,25,1184,1184}" "{o,o,o,o,o,o,o}" "{archived_count,last_archived_wal,last_archived_time,failed_count,last_failed_wal,last_failed_time,stats_reset}" _null_ pg_stat_get_archiver _null_ _null_ _null_ ));
+DATA(insert OID = 3195 (  pg_stat_get_archiver		PGNSP PGUID 12 1 0 0 0 f f f f f f s 0 0 2249 "" "{20,25,1184,20,25,1184,20,1184}" "{o,o,o,o,o,o,o,o}" "{archived_count,last_archived_wal,last_archived_time,failed_count,last_failed_wal,last_failed_time,ready_count,stats_reset}" _null_ pg_stat_get_archiver _null_ _null_ _null_ ));
 DESCR("statistics: information about WAL archiver");
 DATA(insert OID = 2769 ( pg_stat_get_bgwriter_timed_checkpoints PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 20 "" _null_ _null_ _null_ _null_ pg_stat_get_bgwriter_timed_checkpoints _null_ _null_ _null_ ));
 DESCR("statistics: number of timed checkpoints started by the bgwriter");
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 0892533..55d046d 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -377,6 +377,7 @@ typedef struct PgStat_MsgArchiver
 {
 	PgStat_MsgHdr m_hdr;
 	bool		m_failed;		/* Failed attempt */
+	uint64		m_ready_count;
 	char		m_xlog[MAX_XFN_CHARS + 1];
 	TimestampTz m_timestamp;
 } PgStat_MsgArchiver;
@@ -650,6 +651,7 @@ typedef struct PgStat_ArchiverStats
 	PgStat_Counter failed_count;	/* failed archival attempts */
 	char		last_failed_wal[MAX_XFN_CHARS + 1];		/* WAL file involved in
 														 * last failure */
+	int64		ready_count;			/* files ready to be archived */
 	TimestampTz last_failed_timestamp;	/* last archival failure time */
 	TimestampTz stat_reset_timestamp;
 } PgStat_ArchiverStats;
@@ -934,7 +936,8 @@ extern void pgstat_twophase_postcommit(TransactionId xid, uint16 info,
 extern void pgstat_twophase_postabort(TransactionId xid, uint16 info,
 						  void *recdata, uint32 len);
 
-extern void pgstat_send_archiver(const char *xlog, bool failed);
+extern void pgstat_send_archiver(const char *xlog, int64 ready_count,
+						  bool failed);
 extern void pgstat_send_bgwriter(void);
 
 /* ----------
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index c79b45c..5eaf138 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1686,8 +1686,9 @@ pg_stat_archiver| SELECT s.archived_count,
     s.failed_count,
     s.last_failed_wal,
     s.last_failed_time,
+    s.ready_count,
     s.stats_reset
-   FROM pg_stat_get_archiver() s(archived_count, last_archived_wal, last_archived_time, failed_count, last_failed_wal, last_failed_time, stats_reset);
+   FROM pg_stat_get_archiver() s(archived_count, last_archived_wal, last_archived_time, failed_count, last_failed_wal, last_failed_time, ready_count, stats_reset);
 pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
     pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
     pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
-- 
2.1.3

#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Michael Paquier (#10)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On 18 November 2014 06:20, Michael Paquier <michael.paquier@gmail.com> wrote:

the DBA may want to know how long is the queue of WAL files
waiting to be archived.

Agreed

That's IMO something we simply forgot in the
first implementation of pg_stat_archiver

That's not how it appears to me. ISTM that the information requested
is already available, it just needs some minor calculations to work
out how many files are required.

the most direct way to
know that is to count the .ready files in archive_status.

...my earlier point was...

pg_stat_archiver already has a column for last_archived_wal and
last_failed_wal, so you can already work out how many files there must
be between then and now. Perhaps that can be added directly to the
view, to assist the user in calculating it. Reading the directory
itself to count the file is unnecessary, except as a diagnostic.

As soon as we have sent the first file, we will know the queue length
at any point afterwards.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#13Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#11)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

Hearing nothing from the original author, this patch that was in state
"Waiting on Author" for a couple of days is switched to "returned with
feedback".
Regards,
--
Michael

#14Julien Rouhaud
julien.rouhaud@dalibo.com
In reply to: Michael Paquier (#11)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Le 18/11/2014 08:36, Michael Paquier a écrit :

On Wed, Oct 22, 2014 at 12:50 AM, Brightwell, Adam
<adam.brightwell@crunchydatasolutions.com> wrote:

Though, I would think that the general desire would be to keep
the patch relevant ONLY to the necessary changes. I would not
qualify making those types of changes as relevant, IMHO. I do
think this is potential for cleanup, however, I would suspect
that would be best done in a separate patch. But again, I'd
defer to a committer whether such changes are even
necessary/acceptable.

I have been looking at this patch, and I think that it is a mistake
to count the .ready files present in archive_status when calling
pg_stat_get_archiver(). If there are many files waiting to be
archived, this penalizes the run time of this function and of any
application relying on its results, not to mention that the loop
used to count the .ready files is a copy of what is in pgarch.c.
Hence I think that we should simply count them in pgarch_readyXlog
and return that value to pgarch_ArchiverCopyLoop. The value could
then be decremented by 1 each time a file is successfully archived,
which keeps the stats as precise as possible and gives the user
useful information while the archiver process is within a single
pass of pgarch_ArchiverCopyLoop. This way, we just need to extend
PgStat_MsgArchiver with a new counter to track this number.

The attached patch, based on v2 sent previously, does so.
Thoughts?

Sorry for this late answer.

I agree with you about the problems of the v2 patch I originally sent.
I think this v3 is the right way of keeping track of .ready files, so
it's ok for me. The v3 also still applies well on current head.

Regards.
- --
Julien Rouhaud
http://dalibo.com - http://dalibo.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJUjFLWAAoJELGaJ8vfEpOqV9AIAI1yTUYqiB8rMJpfM47IHiM6
92fRNJ7sGwuFKD7Vb2gcMuRLelhFVRevJ7tjhggci8Y36j6YDXgqz74kTjkXvcjN
/SlyS2CIcSleWwvJ2A/WZM0rIzbtm1DTahKupQQ8UdcjHsk3m8T+nySIGyQWdKzz
X9JiXATztlevAaC/1Mf+zsbDSzW5tiQVfIm835G1/sEqIXh43TQyyXyr/nJFlFfQ
85OPssInrxt1e2F82s3SoXb7lIBZg77fZTEusxG5zHX5ANF6uMpF7CBJiZXezRYw
xWrKKuJBLw4zSimzNsVYpxNN3jJuANEAkvzIV+glKDYD57A3DbmpYSJ+btXtDIw=
=JKhg
-----END PGP SIGNATURE-----


#15Michael Paquier
michael.paquier@gmail.com
In reply to: Julien Rouhaud (#14)
Re: [TODO] Track number of files ready to be archived in pg_stat_archiver

On Sat, Dec 13, 2014 at 11:53 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

I agree with you about the problems of the v2 patch I originally sent.
I think this v3 is the right way of keeping track of .ready files, so
it's ok for me. The v3 also still applies well on current head.

Simon made a good point in mentioning that we can currently estimate the
number of files to be archived with the information that we have now,
given how the logic in the archiver works. This information would
still be useful for a freshly promoted node that needs to archive a
bunch of files, btw... But there are now also discussions about
having a node archive only the WAL files it produces, i.e. a master
archiving only WAL files on its current timeline, so we wouldn't
really need this patch if that's done.
--
Michael
