Enable WAL archiving even in standby

Started by Fujii Masaoover 11 years ago7 messages
#1Fujii Masao
masao.fujii@gmail.com
1 attachment(s)

Hi,

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as the valid value of archive_mode.
If it's set to "always", the archiver is started when the server is in standby
mode and all the WAL files that walreceiver wrote to the disk are archived by
using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the meanings
of the setting values "on" and "off" of archive_mode.

I think that this feature is useful for the case, e.g., where large database
needs to be replicated between remote servers. Imagine the situation where
the replicated database gets corrupted completely in the remote standby.
How should we address this problematic situation and restart the standby?

One approach is to take a fresh backup from the master and restore it onto
the standby. But since the database is large and there is long distance
between two servers, this approach might take a surprisingly long time.

Another approach is to restore the backup which was taken from the standby
before. But most of many WAL files which the backup needs might exist only
in the master (because WAL archiving cannot be enabled in the standby) and
they need to be transfered from the master to the standby via long-distance
network. So I think that this approach also would take a fairly long time.
To shorten that time, you may think that archive_command in the master can
be set so that it transfers WAL files from the master to the standby's
archival storage. I agree that this setting can accelerate the database restore
process. But this causes every WAL files to be transfered between remote
servers twice (one is by streaming replication, another is by archive_command),
and which is a waste of network bandwidth.

Enabling WAL archiving in standby is one of solutions for this situation. We
can expect that most of WAL files that are required for the backup taken from
the standby would exist in the standby's archival storage.

Back to the patch. If archive_mode is set to "always", archive_command is
always used to archive WAL files even during recovery. Do we need to separate
the command into two for master and standby, respectively? We can add
something like standby_archive_command parameter which is used to archive
only WAL files walreceiver writes. The other WAL files are archived by
archive_command. I'm not sure if it's really worth separating the command
that way. Is there any use case?

The patch doesn't allow us to enable WAL archiving *only* during recovery.
Should we support yet another archive_mode like "standby" which allows
the archiver to be running only during recovery, but makes it end just after
the server is promoted to master? I'm not sure if there is really use case for
that.

I've not included the update of document in the patch yet. If we agree to
support this feature, I will do the remaining work.

Regards,

--
Fujii Masao

Attachments:

standby_wal_archiving_v1.patchtext/x-patch; charset=US-ASCII; name=standby_wal_archiving_v1.patchDownload
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 81,87 **** int			CheckPointSegments = 3;
  int			wal_keep_segments = 0;
  int			XLOGbuffers = -1;
  int			XLogArchiveTimeout = 0;
! bool		XLogArchiveMode = false;
  char	   *XLogArchiveCommand = NULL;
  bool		EnableHotStandby = false;
  bool		fullPageWrites = true;
--- 81,87 ----
  int			wal_keep_segments = 0;
  int			XLOGbuffers = -1;
  int			XLogArchiveTimeout = 0;
! int			XLogArchiveMode = ARCHIVE_MODE_OFF;
  char	   *XLogArchiveCommand = NULL;
  bool		EnableHotStandby = false;
  bool		fullPageWrites = true;
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
***************
*** 89,94 ****
--- 89,95 ----
  
  #include "access/transam.h"
  #include "access/xlog.h"
+ #include "access/xlogarchive.h"
  #include "bootstrap/bootstrap.h"
  #include "catalog/pg_control.h"
  #include "lib/ilist.h"
***************
*** 823,831 **** PostmasterMain(int argc, char *argv[])
  		write_stderr("%s: max_wal_senders must be less than max_connections\n", progname);
  		ExitPostmaster(1);
  	}
! 	if (XLogArchiveMode && wal_level == WAL_LEVEL_MINIMAL)
  		ereport(ERROR,
! 				(errmsg("WAL archival (archive_mode=on) requires wal_level \"archive\", \"hot_standby\", or \"logical\"")));
  	if (max_wal_senders > 0 && wal_level == WAL_LEVEL_MINIMAL)
  		ereport(ERROR,
  				(errmsg("WAL streaming (max_wal_senders > 0) requires wal_level \"archive\", \"hot_standby\", or \"logical\"")));
--- 824,832 ----
  		write_stderr("%s: max_wal_senders must be less than max_connections\n", progname);
  		ExitPostmaster(1);
  	}
! 	if (XLogArchiveMode > ARCHIVE_MODE_OFF && wal_level == WAL_LEVEL_MINIMAL)
  		ereport(ERROR,
! 				(errmsg("WAL archival (archive_mode=on/always) requires wal_level \"archive\", \"hot_standby\", or \"logical\"")));
  	if (max_wal_senders > 0 && wal_level == WAL_LEVEL_MINIMAL)
  		ereport(ERROR,
  				(errmsg("WAL streaming (max_wal_senders > 0) requires wal_level \"archive\", \"hot_standby\", or \"logical\"")));
***************
*** 1617,1624 **** ServerLoop(void)
  				start_autovac_launcher = false; /* signal processed */
  		}
  
! 		/* If we have lost the archiver, try to start a new one */
! 		if (XLogArchivingActive() && PgArchPID == 0 && pmState == PM_RUN)
  			PgArchPID = pgarch_start();
  
  		/* If we have lost the stats collector, try to start a new one */
--- 1618,1634 ----
  				start_autovac_launcher = false; /* signal processed */
  		}
  
! 		/*
! 		 * If we have lost the archiver, try to start a new one.
! 		 *
! 		 * If WAL archiving is enabled always, we try to start a new archiver
! 		 * even during recovery.
! 		 */
! 		if (PgArchPID == 0 &&
! 			((XLogArchivingActive() && pmState == PM_RUN) ||
! 			 (XLogArchivingActiveAlways() &&
! 			  (pmState == PM_RECOVERY ||
! 			   pmState == PM_HOT_STANDBY))))
  			PgArchPID = pgarch_start();
  
  		/* If we have lost the stats collector, try to start a new one */
***************
*** 4780,4785 **** sigusr1_handler(SIGNAL_ARGS)
--- 4790,4803 ----
  		Assert(BgWriterPID == 0);
  		BgWriterPID = StartBackgroundWriter();
  
+ 		/*
+ 		 * Try to start an archiver even during recovery if archiving is
+ 		 * enabled always.
+ 		 */
+ 		Assert(PgArchPID == 0);
+ 		if (XLogArchivingActiveAlways())
+ 			PgArchPID = pgarch_start();
+ 
  		pmState = PM_RECOVERY;
  	}
  	if (CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) &&
*** a/src/backend/replication/walreceiver.c
--- b/src/backend/replication/walreceiver.c
***************
*** 547,557 **** WalReceiverMain(void)
  								XLogFileNameP(recvFileTLI, recvSegNo))));
  
  			/*
! 			 * Create .done file forcibly to prevent the streamed segment from
! 			 * being archived later.
  			 */
  			XLogFileName(xlogfname, recvFileTLI, recvSegNo);
! 			XLogArchiveForceDone(xlogfname);
  		}
  		recvFile = -1;
  
--- 547,563 ----
  								XLogFileNameP(recvFileTLI, recvSegNo))));
  
  			/*
! 			 * Notify the archiver that the WAL segment is ready to
! 			 * copy to archival storage even during recovery
! 			 * if WAL archiving is enabled always. Otherwise,
! 			 * create .done file forcibly to prevent the streamed segment
! 			 * from being archived later.
  			 */
  			XLogFileName(xlogfname, recvFileTLI, recvSegNo);
! 			if (XLogArchivingActiveAlways())
! 				XLogArchiveNotify(xlogfname);
! 			else
! 				XLogArchiveForceDone(xlogfname);
  		}
  		recvFile = -1;
  
***************
*** 904,914 **** XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
  									XLogFileNameP(recvFileTLI, recvSegNo))));
  
  				/*
! 				 * Create .done file forcibly to prevent the streamed segment
  				 * from being archived later.
  				 */
  				XLogFileName(xlogfname, recvFileTLI, recvSegNo);
! 				XLogArchiveForceDone(xlogfname);
  			}
  			recvFile = -1;
  
--- 910,926 ----
  									XLogFileNameP(recvFileTLI, recvSegNo))));
  
  				/*
! 				 * Notify the archiver that the WAL segment is ready to
! 				 * copy to archival storage even during recovery
! 				 * if WAL archiving is enabled always. Otherwise,
! 				 * create .done file forcibly to prevent the streamed segment
  				 * from being archived later.
  				 */
  				XLogFileName(xlogfname, recvFileTLI, recvSegNo);
! 				if (XLogArchivingActiveAlways())
! 					XLogArchiveNotify(xlogfname);
! 				else
! 					XLogArchiveForceDone(xlogfname);
  			}
  			recvFile = -1;
  
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 401,406 **** static const struct config_enum_entry huge_pages_options[] = {
--- 401,423 ----
  };
  
  /*
+  * Although only "on", "off", and "always" are documented,
+  * we accept all the likely variants of "on" and "off".
+  */
+ static const struct config_enum_entry archive_mode_options[] = {
+ 	{"always", ARCHIVE_MODE_ALWAYS, false},
+ 	{"on", ARCHIVE_MODE_ON, false},
+ 	{"off", ARCHIVE_MODE_OFF, false},
+ 	{"true", ARCHIVE_MODE_ON, true},
+ 	{"false", ARCHIVE_MODE_OFF, true},
+ 	{"yes", ARCHIVE_MODE_ON, true},
+ 	{"no", ARCHIVE_MODE_OFF, true},
+ 	{"1", ARCHIVE_MODE_ON, true},
+ 	{"0", ARCHIVE_MODE_OFF, true},
+ 	{NULL, 0, false}
+ };
+ 
+ /*
   * Options for enum values stored in other modules
   */
  extern const struct config_enum_entry wal_level_options[];
***************
*** 1429,1444 **** static struct config_bool ConfigureNamesBool[] =
  	},
  
  	{
- 		{"archive_mode", PGC_POSTMASTER, WAL_ARCHIVING,
- 			gettext_noop("Allows archiving of WAL files using archive_command."),
- 			NULL
- 		},
- 		&XLogArchiveMode,
- 		false,
- 		NULL, NULL, NULL
- 	},
- 
- 	{
  		{"hot_standby", PGC_POSTMASTER, REPLICATION_STANDBY,
  			gettext_noop("Allows connections and queries during recovery."),
  			NULL
--- 1446,1451 ----
***************
*** 3421,3426 **** static struct config_enum ConfigureNamesEnum[] =
--- 3428,3443 ----
  	},
  
  	{
+ 		{"archive_mode", PGC_POSTMASTER, WAL_ARCHIVING,
+ 			gettext_noop("Allows archiving of WAL files using archive command."),
+ 			NULL
+ 		},
+ 		&XLogArchiveMode,
+ 		ARCHIVE_MODE_OFF, archive_mode_options,
+ 		NULL, NULL, NULL
+ 	},
+ 
+ 	{
  		{"trace_recovery_messages", PGC_SIGHUP, DEVELOPER_OPTIONS,
  			gettext_noop("Enables logging of recovery-related debugging information."),
  			gettext_noop("Each level includes all the levels that follow it. The later"
*** a/src/backend/utils/misc/postgresql.conf.sample
--- b/src/backend/utils/misc/postgresql.conf.sample
***************
*** 202,208 ****
  
  # - Archiving -
  
! #archive_mode = off		# allows archiving to be done
  				# (change requires restart)
  #archive_command = ''		# command to use to archive a logfile segment
  				# placeholders: %p = path of file to archive
--- 202,208 ----
  
  # - Archiving -
  
! #archive_mode = off		# allows archiving to be done; off, on, or always
  				# (change requires restart)
  #archive_command = ''		# command to use to archive a logfile segment
  				# placeholders: %p = path of file to archive
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
***************
*** 12,17 ****
--- 12,18 ----
  #define XLOG_H
  
  #include "access/rmgr.h"
+ #include "access/xlogarchive.h"
  #include "access/xlogdefs.h"
  #include "datatype/timestamp.h"
  #include "lib/stringinfo.h"
***************
*** 188,194 **** extern int	CheckPointSegments;
  extern int	wal_keep_segments;
  extern int	XLOGbuffers;
  extern int	XLogArchiveTimeout;
! extern bool XLogArchiveMode;
  extern char *XLogArchiveCommand;
  extern bool EnableHotStandby;
  extern bool fullPageWrites;
--- 189,195 ----
  extern int	wal_keep_segments;
  extern int	XLOGbuffers;
  extern int	XLogArchiveTimeout;
! extern int	XLogArchiveMode;
  extern char *XLogArchiveCommand;
  extern bool EnableHotStandby;
  extern bool fullPageWrites;
***************
*** 206,212 **** typedef enum WalLevel
  } WalLevel;
  extern int	wal_level;
  
! #define XLogArchivingActive()	(XLogArchiveMode && wal_level >= WAL_LEVEL_ARCHIVE)
  #define XLogArchiveCommandSet() (XLogArchiveCommand[0] != '\0')
  
  /*
--- 207,216 ----
  } WalLevel;
  extern int	wal_level;
  
! #define XLogArchivingActive()	(XLogArchiveMode > ARCHIVE_MODE_OFF \
! 								 && wal_level >= WAL_LEVEL_ARCHIVE)
! #define XLogArchivingActiveAlways()	(XLogArchiveMode == ARCHIVE_MODE_ALWAYS \
! 								 && wal_level >= WAL_LEVEL_ARCHIVE)
  #define XLogArchiveCommandSet() (XLogArchiveCommand[0] != '\0')
  
  /*
*** /dev/null
--- b/src/include/access/xlogarchive.h
***************
*** 0 ****
--- 1,21 ----
+ /*
+  * xlogarchive.h
+  *
+  * Functions for archiving WAL files and restoring from the archive.
+  *
+  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * src/include/access/xlogarchive.h
+  */
+ #ifndef XLOG_ARCHIVE_H
+ #define XLOG_ARCHIVE_H
+ 
+ typedef enum
+ {
+ 	ARCHIVE_MODE_OFF,	/* disabled*/
+ 	ARCHIVE_MODE_ON,	/*enabled while server is running normally */
+ 	ARCHIVE_MODE_ALWAYS	/* enabled always (even during recovery) */
+ }	ArchiveModeLevel;
+ 
+ #endif   /* XLOG_ARCHIVE_H */
#2Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#1)
Re: Enable WAL archiving even in standby

On Wed, Aug 13, 2014 at 6:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as the valid value of archive_mode.
If it's set to "always", the archiver is started when the server is in standby
mode and all the WAL files that walreceiver wrote to the disk are archived by
using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the meanings
of the setting values "on" and "off" of archive_mode.

I like the feature, but I don't much like this as a control mechanism.
Having archive_command and standby_archive_command, as you propose
further down, seems saner.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#2)
Re: Enable WAL archiving even in standby

On Fri, Aug 15, 2014 at 4:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 13, 2014 at 6:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as the valid value of archive_mode.
If it's set to "always", the archiver is started when the server is in standby
mode and all the WAL files that walreceiver wrote to the disk are archived by
using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the meanings
of the setting values "on" and "off" of archive_mode.

I like the feature, but I don't much like this as a control mechanism.
Having archive_command and standby_archive_command, as you propose
further down, seems saner.

Okay, that's fine. One question is; Which WAL files should be archived by
standby_archive_command? There are following kinds of WAL files.

(1) WAL files which were fully written and closed by walreceiver
Curently they are not archived at all.

(2) WAL file which is being written by walreceiver
This file will be closed before it's fully written because of,
for example, standby promotion.
Currently this is archived by archive_command.

(3) WAL file with new timeline, which is copied from (2)
At the end of recovery, after new timeline is assigned,
this latest WAL file with new timeline is created by being copied
from (2) (i.e., latest WAL file with old timeline). WAL data of
end-of-recovery checkpoint is written to this latest WAL file.
Currently this is archived by archive_command.

(4) Timeline history files
When standby is promoted to the master, the imeline is incremented
and the timeline history file is created.
Currently the timeline history files are archived by archive_command.

(5) WAL files generated in normal processing mode
Currently they are archived by archive_command.

I'm thinking to use standby_archive_command only for (1) because
the others are currently archived by archive_command. That means
that even if there are type (1) WAL files which have not been archived
yet after the standby promotion (i.e., the situation where WAL archiving
was delayed for some reasons in the standby), they are archived by
standby_archive_command. IOW, the archiver uses both archive commands
as the situation demands.

OTOH, maybe there are people who want to use standby_archive_command
for all the WAL files with old timeline, i.e., (1) and (2). Thought?

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4MauMau
maumau307@gmail.com
In reply to: Fujii Masao (#1)
Re: Enable WAL archiving even in standby

From: "Fujii Masao" <masao.fujii@gmail.com>

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as the valid value of
archive_mode.
If it's set to "always", the archiver is started when the server is in
standby
mode and all the WAL files that walreceiver wrote to the disk are archived
by
using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the
meanings
of the setting values "on" and "off" of archive_mode.

I think that this feature is useful for the case, e.g., where large
database
needs to be replicated between remote servers. Imagine the situation where
the replicated database gets corrupted completely in the remote standby.
How should we address this problematic situation and restart the standby?

One approach is to take a fresh backup from the master and restore it onto
the standby. But since the database is large and there is long distance
between two servers, this approach might take a surprisingly long time.

Another approach is to restore the backup which was taken from the standby
before. But most of many WAL files which the backup needs might exist only
in the master (because WAL archiving cannot be enabled in the standby) and
they need to be transfered from the master to the standby via
long-distance
network. So I think that this approach also would take a fairly long time.
To shorten that time, you may think that archive_command in the master can
be set so that it transfers WAL files from the master to the standby's
archival storage. I agree that this setting can accelerate the database
restore
process. But this causes every WAL files to be transfered between remote
servers twice (one is by streaming replication, another is by
archive_command),
and which is a waste of network bandwidth.

Great. This is exactly what I hoped for disaster recovery, although I
haven't looked at the patch yet.

Back to the patch. If archive_mode is set to "always", archive_command is
always used to archive WAL files even during recovery. Do we need to
separate
the command into two for master and standby, respectively? We can add
something like standby_archive_command parameter which is used to archive
only WAL files walreceiver writes. The other WAL files are archived by
archive_command. I'm not sure if it's really worth separating the command
that way. Is there any use case?

I don't see any reason to separate parameters. I want the spec simple.

I've not included the update of document in the patch yet. If we agree to
support this feature, I will do the remaining work.

Could you consider adding a new section for disaster recovery that describes
concrete parameter settings (e.g. how do we discard old archive WAL files
after taking a base backup from standby, because backup label file is not
created?). Good luck!

Regards
MauMau

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#3)
Re: Enable WAL archiving even in standby

On Tue, Aug 19, 2014 at 7:33 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Fri, Aug 15, 2014 at 4:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 13, 2014 at 6:42 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as the valid value of archive_mode.
If it's set to "always", the archiver is started when the server is in standby
mode and all the WAL files that walreceiver wrote to the disk are archived by
using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the meanings
of the setting values "on" and "off" of archive_mode.

I like the feature, but I don't much like this as a control mechanism.
Having archive_command and standby_archive_command, as you propose
further down, seems saner.

Okay, that's fine. One question is; Which WAL files should be archived by
standby_archive_command? There are following kinds of WAL files.

(1) WAL files which were fully written and closed by walreceiver
Curently they are not archived at all.

(2) WAL file which is being written by walreceiver
This file will be closed before it's fully written because of,
for example, standby promotion.
Currently this is archived by archive_command.

(3) WAL file with new timeline, which is copied from (2)
At the end of recovery, after new timeline is assigned,
this latest WAL file with new timeline is created by being copied
from (2) (i.e., latest WAL file with old timeline). WAL data of
end-of-recovery checkpoint is written to this latest WAL file.
Currently this is archived by archive_command.

(4) Timeline history files
When standby is promoted to the master, the imeline is incremented
and the timeline history file is created.
Currently the timeline history files are archived by archive_command.

(5) WAL files generated in normal processing mode
Currently they are archived by archive_command.

I'm thinking to use standby_archive_command only for (1) because
the others are currently archived by archive_command. That means
that even if there are type (1) WAL files which have not been archived
yet after the standby promotion (i.e., the situation where WAL archiving
was delayed for some reasons in the standby), they are archived by
standby_archive_command. IOW, the archiver uses both archive commands
as the situation demands.

OTOH, maybe there are people who want to use standby_archive_command
for all the WAL files with old timeline, i.e., (1) and (2). Thought?

Boy, that's quite confusing. I didn't think we ever ran
archive_command on the standby right now, so then it would make sense
to have a way to do that. And it makes sense for it to be separate
from the mode used on the master to avoid breaking existing
configurations, so that a user assuming that a certain setting will
only take effect after promotion will not be disappointed. However,
if what you're saying is that we do archiving on the standby sometimes
but not others, I'm not quite sure what the best way forward is. It
seems rather inconsistent to do it for some types of WAL files but not
others.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Alexey Klyukin
alexk@hintbits.com
In reply to: Fujii Masao (#1)
Re: Enable WAL archiving even in standby

On Wed, Aug 13, 2014 at 12:42 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

Hi,

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby.

...

I think that this feature is useful for the case, e.g., where large database
needs to be replicated between remote servers. Imagine the situation where
the replicated database gets corrupted completely in the remote standby.
How should we address this problematic situation and restart the standby?

One approach is to take a fresh backup from the master and restore it onto
the standby. But since the database is large and there is long distance
between two servers, this approach might take a surprisingly long time.

Another approach is to restore the backup which was taken from the standby
before. But most of many WAL files which the backup needs might exist only
in the master (because WAL archiving cannot be enabled in the standby) and
they need to be transfered from the master to the standby via long-distance
network. So I think that this approach also would take a fairly long time.
To shorten that time, you may think that archive_command in the master can
be set so that it transfers WAL files from the master to the standby's
archival storage. I agree that this setting can accelerate the database restore
process. But this causes every WAL files to be transfered between remote
servers twice (one is by streaming replication, another is by archive_command),
and which is a waste of network bandwidth.

Well, in theory one can also use pg_receivexlog to get the WAL files
from master,
and then run them through the PITR on the slave without the
archive_cleanup command.

I'm not sure you can do the same if the source of the WAL files is a
cascading slave,
but I see no reasons why not to.

However, I find the patch useful, since it allows accomplishing
things in a much more
straightforward way.

Back to the patch. If archive_mode is set to "always", archive_command is
always used to archive WAL files even during recovery. Do we need to separate
the command into two for master and standby, respectively? We can add
something like standby_archive_command parameter which is used to archive
only WAL files walreceiver writes. The other WAL files are archived by
archive_command. I'm not sure if it's really worth separating the command
that way. Is there any use case?

I don't see a good use case for doing things only on standby, but I can imagine
that some different archiving methods might be used depending on the role of
the archiver: on master, one may do, for instance, additional copy to the NFS
partition. Does it make sense to expose the server role ('is_master') via an
additional variable available to the recovery_command (i.e. %m)?

The patch doesn't allow us to enable WAL archiving *only* during recovery.
Should we support yet another archive_mode like "standby" which allows
the archiver to be running only during recovery, but makes it end just after
the server is promoted to master? I'm not sure if there is really use case for
that.

I do not see much use for this as well.

I've not included the update of document in the patch yet. If we agree to
support this feature, I will do the remaining work.

I think it is useful, and I gave this patch a spin by, essentially, creating a
cascaded archive-only slave. I made a base backup from master, then
ran the standby from this base backup with archive_mode = 'always' and
archive_command copying files to the archive_location, then created another
base backup out of it (without including WAL files into the backup) and pointed
the recovery command of the final slave into the archive created by
the intermediate one.

Recovery worked, as well as the promotion of the intermediate slave to
the master,
the final slave just caught up with the timeline changes (with
recovery_timeline set to
'latest') and went on with the recovery.

One thing I've noticed is that pg_basebackup copies the postgresql.conf from the
standby verbatim, including the archive_mode, which means that if one runs
the cascaded replica without changing the archive_mode, that replica
will try to archive
the WAL, and if both the source and the replica are running on the same machine
(or attached to NFS using the same mount points) even the destination
for archiving
will be the same. Should not be a big problem if one follows the
recommendation of not
overwriting the files that already exist at the destination, but it
would be nice to reset the
archive_mode flag to off.

I do not know much about the WAL-related code, but one thing that I
found strange
in the patch is a separate file xlogarchive.h instead of putting
stuff into xlog.h?
Does not make much sense for a single enum, are you planning to put
more things there?

There was a single hunk when applying this against the latest master:

Hunk #4 succeeded at 4789 (offset -1 lines).

--
Regards,
Alexey Klyukin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Simon Riggs
simon@2ndquadrant.com
In reply to: Fujii Masao (#1)
Re: Enable WAL archiving even in standby

On 13 August 2014 11:42, Fujii Masao <masao.fujii@gmail.com> wrote:

I'd propose the attached WIP patch which allows us to enable WAL archiving
even in standby. The patch adds "always" as the valid value of archive_mode.
If it's set to "always", the archiver is started when the server is in standby
mode and all the WAL files that walreceiver wrote to the disk are archived by
using archive_command. Then, even after the server is promoted to master,
the archiver keeps archiving WAL files. The patch doesn't change the meanings
of the setting values "on" and "off" of archive_mode.

Seems OK idea.

Perhaps better to have a new parameter called
archive_role = Origin | Replica | Always should be used so we match
up with how triggers/rules work.
rather than abuse archive_mode

We can pass a attribute called %m (mode?) which tells the
archive_command whether we are Master or Standby when called

Remember to bump the number of background procs by 1 since we
previously assumed archiver would never run in recovery

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers