pg_ctl emits strange warning message
Hi,
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00921.php
(2)
pg_ctl -ms stop emits the following warning whenever there is the
backup_label file in $PGDATA.WARNING: online backup mode is active
Shutdown will not complete until pg_stop_backup() is called.This warning doesn't fit in with the shutdown during recovery case.
Since smart shutdown might be requested by other than pg_ctl, the
warning should be emitted in server side rather than client, I think.
How about moving the warning to the server side?Hmm, I'm not sure whether that's a good idea or not. Perhaps we
should discuss for 9.1?Okay, this is not a critical problem.
Let's revisit this issue. The problem here is that pg_ctl might emit
the following warning messages when it shuts down the standby server.
This is very strange thing since online backup mode is never active
in the standby.
WARNING: online backup mode is active
Shutdown will not complete until pg_stop_backup() is called.
Another problem is that the above useful messages are not emitted
when shutdown is requested by other than pg_ctl.
The attached patch moves the warning from pg_ctl to the server side
and makes postmaster emit it only when online backup mode is really
active. This can urge users to call pg_stop_backup to complete smart
shutdown whenever it's requested during online backup. I added the
patch into the next CF.
Comments?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
warning_during_shutdown_v1.patchapplication/octet-stream; name=warning_during_shutdown_v1.patchDownload
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
***************
*** 2193,2198 **** pmdie(SIGNAL_ARGS)
--- 2193,2209 ----
*/
pmState = (pmState == PM_RUN) ?
PM_WAIT_BACKUP : PM_WAIT_READONLY;
+
+ /*
+ * Urge users to call pg_stop_backup to complete the smart
+ * shutdown if online backup mode is active.
+ */
+ if (pmState == PM_WAIT_BACKUP && BackupInProgress())
+ ereport(WARNING,
+ (errmsg("online backup mode is active"),
+ errhint("Shutdown will not complete until pg_stop_backup() is called.")));
+ else
+ pmState = PM_WAIT_BACKENDS;
}
/*
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
***************
*** 141,147 **** static bool postmaster_is_alive(pid_t pid);
static char postopts_file[MAXPGPATH];
static char pid_file[MAXPGPATH];
static char conf_file[MAXPGPATH];
- static char backup_file[MAXPGPATH];
#if defined(HAVE_GETRLIMIT) && defined(RLIMIT_CORE)
static void unlimit_core_size(void);
--- 141,146 ----
***************
*** 769,775 **** do_stop(void)
{
int cnt;
pgpid_t pid;
- struct stat statbuf;
pid = get_pgpid();
--- 768,773 ----
***************
*** 802,813 **** do_stop(void)
}
else
{
- if ((shutdown_mode == SMART_MODE) && (stat(backup_file, &statbuf) == 0))
- {
- print_msg(_("WARNING: online backup mode is active\n"
- "Shutdown will not complete until pg_stop_backup() is called.\n\n"));
- }
-
print_msg(_("waiting for server to shut down..."));
for (cnt = 0; cnt < wait_seconds; cnt++)
--- 800,805 ----
***************
*** 844,850 **** do_restart(void)
{
int cnt;
pgpid_t pid;
- struct stat statbuf;
pid = get_pgpid();
--- 836,841 ----
***************
*** 879,890 **** do_restart(void)
exit(1);
}
- if ((shutdown_mode == SMART_MODE) && (stat(backup_file, &statbuf) == 0))
- {
- print_msg(_("WARNING: online backup mode is active\n"
- "Shutdown will not complete until pg_stop_backup() is called.\n\n"));
- }
-
print_msg(_("waiting for server to shut down..."));
/* always wait for restart */
--- 870,875 ----
***************
*** 1961,1967 **** main(int argc, char **argv)
snprintf(postopts_file, MAXPGPATH, "%s/postmaster.opts", pg_data);
snprintf(pid_file, MAXPGPATH, "%s/postmaster.pid", pg_data);
snprintf(conf_file, MAXPGPATH, "%s/postgresql.conf", pg_data);
- snprintf(backup_file, MAXPGPATH, "%s/backup_label", pg_data);
}
switch (ctl_command)
--- 1946,1951 ----
On 13/09/10 13:08, Fujii Masao wrote:
Hi,
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00921.php
(2)
pg_ctl -ms stop emits the following warning whenever there is the
backup_label file in $PGDATA.WARNING: online backup mode is active
Shutdown will not complete until pg_stop_backup() is called.This warning doesn't fit in with the shutdown during recovery case.
Since smart shutdown might be requested by other than pg_ctl, the
warning should be emitted in server side rather than client, I think.
How about moving the warning to the server side?Hmm, I'm not sure whether that's a good idea or not. Perhaps we
should discuss for 9.1?Okay, this is not a critical problem.
Let's revisit this issue. The problem here is that pg_ctl might emit
the following warning messages when it shuts down the standby server.
This is very strange thing since online backup mode is never active
in the standby.WARNING: online backup mode is active
Shutdown will not complete until pg_stop_backup() is called.Another problem is that the above useful messages are not emitted
when shutdown is requested by other than pg_ctl.The attached patch moves the warning from pg_ctl to the server side
and makes postmaster emit it only when online backup mode is really
active. This can urge users to call pg_stop_backup to complete smart
shutdown whenever it's requested during online backup. I added the
patch into the next CF.
There's this comment by Robert Haas:
http://archives.postgresql.org/pgsql-hackers/2010-05/msg01433.php
Like Robert, I'm also a bit worried that this might make the message
less prominent. When you run "pg_ctl stop" manually, it's good to get
the message directly in the terminal.
Another line of attack is to remove the backup_label file earlier during
recovery. We could remove it as soon as we update pg_control file with
the same information, ie. the backup start location. And by remove I
mean rename to backup_label.old, to avoid destroying forensic
information ;-).
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 13/09/10 16:01, Heikki Linnakangas wrote:
On 13/09/10 13:08, Fujii Masao wrote:
Hi,
http://archives.postgresql.org/pgsql-hackers/2010-05/msg00921.php
(2)
pg_ctl -ms stop emits the following warning whenever there is the
backup_label file in $PGDATA.WARNING: online backup mode is active
Shutdown will not complete until pg_stop_backup() is called.This warning doesn't fit in with the shutdown during recovery case.
Since smart shutdown might be requested by other than pg_ctl, the
warning should be emitted in server side rather than client, I think.
How about moving the warning to the server side?Hmm, I'm not sure whether that's a good idea or not. Perhaps we
should discuss for 9.1?Okay, this is not a critical problem.
Let's revisit this issue. The problem here is that pg_ctl might emit
the following warning messages when it shuts down the standby server.
This is very strange thing since online backup mode is never active
in the standby.WARNING: online backup mode is active
Shutdown will not complete until pg_stop_backup() is called.Another problem is that the above useful messages are not emitted
when shutdown is requested by other than pg_ctl.The attached patch moves the warning from pg_ctl to the server side
and makes postmaster emit it only when online backup mode is really
active. This can urge users to call pg_stop_backup to complete smart
shutdown whenever it's requested during online backup. I added the
patch into the next CF.There's this comment by Robert Haas:
http://archives.postgresql.org/pgsql-hackers/2010-05/msg01433.php
Like Robert, I'm also a bit worried that this might make the message
less prominent. When you run "pg_ctl stop" manually, it's good to get
the message directly in the terminal.Another line of attack is to remove the backup_label file earlier during
recovery. We could remove it as soon as we update pg_control file with
the same information, ie. the backup start location. And by remove I
mean rename to backup_label.old, to avoid destroying forensic
information ;-).
Yet another idea: Check in pg_ctl if recovery.conf is also present. If
it is, assume we're in recovery and don't print that warning. That would
be dead simple. I would even consider backpatching it to 9.0, this can
be regarded as a bug, albeit a minor one.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Yet another idea: Check in pg_ctl if recovery.conf is also present. If
it is, assume we're in recovery and don't print that warning. That would
be dead simple.
+1. This sounds like a much more appropriate approach.
regards, tom lane
On 13/09/10 18:19, Tom Lane wrote:
Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:
Yet another idea: Check in pg_ctl if recovery.conf is also present. If
it is, assume we're in recovery and don't print that warning. That would
be dead simple.+1. This sounds like a much more appropriate approach.
Hmm, looking at this more closely, I'm a bit confused. We already rename
away backup_label at the beginning of recovery. Under what circumstances
do we have a problem? This starts to seem like a complete non-issue to me.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tue, Sep 14, 2010 at 12:35 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Hmm, looking at this more closely, I'm a bit confused. We already rename
away backup_label at the beginning of recovery. Under what circumstances do
we have a problem? This starts to seem like a complete non-issue to me.
A checkpoint record is read before backup_label file is renamed, so
the problem happens if shutdown is requested while the standby server
is waiting for the WAL containing a checkpoint record to arrive from
the master. This happened some times in my box when testing HS/SR.
To avoid the problem, we should rename backup_label before reading
any WAL record?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Tue, Sep 14, 2010 at 12:10 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Yet another idea: Check in pg_ctl if recovery.conf is also present. If it
is, assume we're in recovery and don't print that warning. That would be
dead simple. I would even consider backpatching it to 9.0, this can be
regarded as a bug, albeit a minor one.
Yeah, it's very simple. Attached patch does that.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
warning_during_shutdown_v2.patchapplication/octet-stream; name=warning_during_shutdown_v2.patchDownload
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
***************
*** 142,147 **** static char postopts_file[MAXPGPATH];
--- 142,148 ----
static char pid_file[MAXPGPATH];
static char conf_file[MAXPGPATH];
static char backup_file[MAXPGPATH];
+ static char recovery_file[MAXPGPATH];
#if defined(HAVE_GETRLIMIT) && defined(RLIMIT_CORE)
static void unlimit_core_size(void);
***************
*** 802,808 **** do_stop(void)
}
else
{
! if ((shutdown_mode == SMART_MODE) && (stat(backup_file, &statbuf) == 0))
{
print_msg(_("WARNING: online backup mode is active\n"
"Shutdown will not complete until pg_stop_backup() is called.\n\n"));
--- 803,810 ----
}
else
{
! if ((shutdown_mode == SMART_MODE) && (stat(backup_file, &statbuf) == 0) &&
! (stat(recovery_file, &statbuf) != 0))
{
print_msg(_("WARNING: online backup mode is active\n"
"Shutdown will not complete until pg_stop_backup() is called.\n\n"));
***************
*** 879,885 **** do_restart(void)
exit(1);
}
! if ((shutdown_mode == SMART_MODE) && (stat(backup_file, &statbuf) == 0))
{
print_msg(_("WARNING: online backup mode is active\n"
"Shutdown will not complete until pg_stop_backup() is called.\n\n"));
--- 881,888 ----
exit(1);
}
! if ((shutdown_mode == SMART_MODE) && (stat(backup_file, &statbuf) == 0) &&
! (stat(recovery_file, &statbuf) != 0))
{
print_msg(_("WARNING: online backup mode is active\n"
"Shutdown will not complete until pg_stop_backup() is called.\n\n"));
***************
*** 1962,1967 **** main(int argc, char **argv)
--- 1965,1971 ----
snprintf(pid_file, MAXPGPATH, "%s/postmaster.pid", pg_data);
snprintf(conf_file, MAXPGPATH, "%s/postgresql.conf", pg_data);
snprintf(backup_file, MAXPGPATH, "%s/backup_label", pg_data);
+ snprintf(recovery_file, MAXPGPATH, "%s/recovery.conf", pg_data);
}
switch (ctl_command)
On 14/09/10 03:39, Fujii Masao wrote:
On Tue, Sep 14, 2010 at 12:35 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:Hmm, looking at this more closely, I'm a bit confused. We already rename
away backup_label at the beginning of recovery. Under what circumstances do
we have a problem? This starts to seem like a complete non-issue to me.A checkpoint record is read before backup_label file is renamed, so
the problem happens if shutdown is requested while the standby server
is waiting for the WAL containing a checkpoint record to arrive from
the master. This happened some times in my box when testing HS/SR.
Ok, I see. Shouldn't be very common, but might happen if you have a set
restore_command incorrectly for example.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 14/09/10 03:55, Fujii Masao wrote:
On Tue, Sep 14, 2010 at 12:10 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:Yet another idea: Check in pg_ctl if recovery.conf is also present. If it
is, assume we're in recovery and don't print that warning. That would be
dead simple. I would even consider backpatching it to 9.0, this can be
regarded as a bug, albeit a minor one.Yeah, it's very simple. Attached patch does that.
Committed with some extra comments. I also backpatched this to 9.0.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tue, Sep 14, 2010 at 5:06 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Committed with some extra comments. I also backpatched this to 9.0.
Thanks! I marked the patch as committed on CF.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Tue, Sep 14, 2010 at 05:35:23PM +0900, Fujii Masao wrote:
On Tue, Sep 14, 2010 at 5:06 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:Committed with some extra comments. I also backpatched this to 9.0.
Thanks! I marked the patch as committed on CF.
Thanks!
Cheers,
David (Commitfest Cat Herd of the Month).
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate