HS/SR and smart shutdown
I've been working on my demo, and I'm discovering that due to the
connection from the walsender and walreceiver, "smart" shutdown from
pg_ctl doesn't work if replication is active.
This seems worth fixing; if we don't fix it, we should at least document it.
Comments?
--Josh
On Thu, Jan 21, 2010 at 8:04 AM, Josh Berkus <josh@agliodbs.com> wrote:
I've been working on my demo, and I'm discovering that due to the
connection from the walsender and walreceiver, "smart" shutdown from
pg_ctl doesn't work if replication is active.
This seems worth fixing; if we don't fix it, we should at least document it.
Comments?
Thanks for the report.
Which servers (primary or standby) did you try a "smart" shutdown on?
If it's "primary", could you show me the reproducible test set? At least
in my box, a "smart" shutdown on the primary works fine.
If it's "standby", it's a previously-existing behavior that a "smart"
shutdown doesn't work immediately during recovery. After a recovery
has been completed, it would work. Of course, I agree that such a
behavior should be documented.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
If it's "standby", it's a previously-existing behavior that a "smart"
shutdown doesn't work immediately during recovery. After a recovery
has been completed, it would work. Of course, I agree that such a
behavior should be documented.
Well, as long as streaming rep is running, you can't do a smart shutdown
... smart shutdown seems to treat the walreceiver as a client
connection. At the very least, this should be in the documentation.
--Josh Berkus
On Wed, Jan 20, 2010 at 8:44 PM, Josh Berkus <josh@agliodbs.com> wrote:
If it's "standby", it's a previously-existing behavior that a "smart"
shutdown doesn't work immediately during recovery. After a recovery
has been completed, it would work. Of course, I agree that such a
behavior should be documented.
Well, as long as streaming rep is running, you can't do a smart shutdown
... smart shutdown seems to treat the walreceiver as a client
connection. At the very least, this should be in the documentation.
How hard is it to fix?
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Jan 20, 2010 at 8:44 PM, Josh Berkus <josh@agliodbs.com> wrote:
Well, as long as streaming rep is running, you can't do a smart shutdown
... smart shutdown seems to treat the walreceiver as a client
connection. At the very least, this should be in the documentation.
How hard is it to fix?
I think the first question is do we *want* to fix it, or is it
appropriate behavior?
If the master shuts down, will the slaves try to fail over to become
masters? When the master restarts, will the slaves automatically
reconnect? If these questions have the wrong answers, shutting down the
master isn't something to be done lightly, and automatically
disconnecting slaves would be a real bad idea.
regards, tom lane
On Wed, Jan 20, 2010 at 8:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Jan 20, 2010 at 8:44 PM, Josh Berkus <josh@agliodbs.com> wrote:
Well, as long as streaming rep is running, you can't do a smart shutdown
... smart shutdown seems to treat the walreceiver as a client
connection. At the very least, this should be in the documentation.
How hard is it to fix?
I think the first question is do we *want* to fix it, or is it
appropriate behavior?
If the master shuts down, will the slaves try to fail over to become
masters? When the master restarts, will the slaves automatically
reconnect? If these questions have the wrong answers, shutting down the
master isn't something to be done lightly, and automatically
disconnecting slaves would be a real bad idea.
I thought the scenario in question was that someone wanted to manually
shut down the slave. Am I misunderstanding?
...Robert
Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Jan 20, 2010 at 8:44 PM, Josh Berkus <josh@agliodbs.com> wrote:
Well, as long as streaming rep is running, you can't do a smart shutdown
... smart shutdown seems to treat the walreceiver as a client
connection. At the very least, this should be in the documentation.
How hard is it to fix?
I think the first question is do we *want* to fix it, or is it
appropriate behavior?
If the master shuts down, will the slaves try to fail over to become
masters? When the master restarts, will the slaves automatically
reconnect? If these questions have the wrong answers, shutting down the
master isn't something to be done lightly, and automatically
disconnecting slaves would be a real bad idea.
Right - surely people who have been using pg_standby etc have discovered
this behaviour, so documenting it is fine I would think.
regards
Mark
On Thu, Jan 21, 2010 at 10:44 AM, Josh Berkus <josh@agliodbs.com> wrote:
If it's "standby", it's a previously-existing behavior that a "smart"
shutdown doesn't work immediately during recovery. After a recovery
has been completed, it would work. Of course, I agree that such a
behavior should be documented.
Well, as long as streaming rep is running, you can't do a smart shutdown
... smart shutdown seems to treat the walreceiver as a client
connection.
Even if SR is not running, as long as the startup process is running,
we can't do a smart shutdown. It's not peculiar to SR.
At the very least, this should be in the documentation.
Agreed. Something like "smart shutdown is not allowed during recovery"
should be in the following section.
http://developer.postgresql.org/pgdocs/postgres/server-shutdown.html
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii Masao wrote:
On Thu, Jan 21, 2010 at 10:44 AM, Josh Berkus <josh@agliodbs.com> wrote:
If it's "standby", it's a previously-existing behavior that a "smart"
shutdown doesn't work immediately during recovery. After a recovery
has been completed, it would work. Of course, I agree that such a
behavior should be documented.
Well, as long as streaming rep is running, you can't do a smart shutdown
... smart shutdown seems to treat the walreceiver as a client
connection.
Even if SR is not running, as long as the startup process is running,
we can't do a smart shutdown. It's not peculiar to SR.
Right, that's the way a standby server (= one still in recovery) has
always behaved. It has made sense in the past: it's not in the spirit of
smart shutdown to kill the WAL replay immediately. "smart" means wait
for recovery to finish, then shutdown.
It's a good question if that still makes sense with Hot Standby. Perhaps
we should redefine smart shutdown in standby mode to shut down as soon
as all read-only connections have died.
At the very least, this should be in the documentation.
Agreed. Something like "smart shutdown is not allowed during recovery"
should be in the following section.
http://developer.postgresql.org/pgdocs/postgres/server-shutdown.html
It's allowed, it just doesn't do what you might expect.
In the master, smart shutdown shuts down as soon as all regular backends
are gone. It doesn't wait for the standby connections to die. In fact
they're not killed until after the shutdown checkpoint is written, so
that it gets sent to the standbys too. I think we're good there.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
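The behavior described above can be summarized in a small model. This is only a sketch of the rules as stated in this thread, not code from postmaster.c; the ServerModel struct and the function name are hypothetical and exist purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of what decides a smart shutdown. */
typedef struct ServerModel
{
    bool in_recovery;       /* is the startup process still replaying WAL? */
    int  regular_backends;  /* ordinary client connections */
    int  walsenders;        /* connections serving standbys */
} ServerModel;

/*
 * Returns true if a smart shutdown could move on to the shutdown
 * checkpoint, following the rules described in the message above.
 */
bool
smart_shutdown_can_proceed(const ServerModel *s)
{
    /* Standby: smart shutdown waits for recovery to finish. */
    if (s->in_recovery)
        return false;

    /*
     * Master: only regular backends are waited for. Walsenders are
     * deliberately ignored here, because they stay connected until
     * after the shutdown checkpoint has been written.
     */
    return s->regular_backends == 0;
}
```

In this model a master with two walsenders and no regular backends can proceed, while a standby that is still in recovery never can, which is the complaint that started the thread.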
Heikki Linnakangas wrote:
It's a good question if that still makes sense with Hot Standby. Perhaps
we should redefine smart shutdown in standby mode to shut down as soon
as all read-only connections have died.
I've advocated in the past that an escalating shutdown procedure would
be helpful in general to have available. Start kicking off clients with
smart, continue to fast if there's any left, and if there's still any
left after that (have seen COPY clients that ignore fast) disconnect
them and go to immediate to completely kill them. Once you've started
the server on the road to shutdown, even with smart, you've basically
committed to going all the way down by whatever means is available
anyway, so why not make that more automated and easier.
If something like that were available, I could see inserting a step in
the middle there specifically aimed at resolving this issue. Maybe it's
just a change to the beginning of fast shutdown, or to the end of smart
as I think you're suggesting. Perhaps you only get it if you do one of
these escalating shutdowns I'm proposing, making that the preferred way
to handle HS servers.
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com
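The escalation idea above could be sketched as a simple "next mode" rule. This is a minimal illustration assuming the three existing pg_ctl shutdown modes (smart, fast, immediate); the enum and function names are hypothetical, and nothing like this existed in the server at the time of the thread.

```c
/* The three shutdown modes, in order of increasing force. */
typedef enum ShutdownMode
{
    SHUTDOWN_SMART,
    SHUTDOWN_FAST,
    SHUTDOWN_IMMEDIATE
} ShutdownMode;

/* Name of the mode as it would be passed to "pg_ctl stop -m". */
const char *
shutdown_mode_name(ShutdownMode mode)
{
    switch (mode)
    {
        case SHUTDOWN_SMART:
            return "smart";
        case SHUTDOWN_FAST:
            return "fast";
        default:
            return "immediate";
    }
}

/*
 * Choose the mode for the next attempt: if clients are still connected
 * after the current mode has had its chance, step up to the next one.
 */
ShutdownMode
escalate_shutdown(ShutdownMode current, int clients_remaining)
{
    if (clients_remaining == 0)
        return current;             /* nobody left; no need to escalate */

    switch (current)
    {
        case SHUTDOWN_SMART:
            return SHUTDOWN_FAST;
        default:
            return SHUTDOWN_IMMEDIATE;  /* fast and immediate both end here */
    }
}
```

A caller would wait some grace period after each attempt, count the remaining clients, and call escalate_shutdown() again; how long to wait at each step is a policy question this sketch leaves open.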
On Thu, Jan 21, 2010 at 4:27 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
It's a good question if that still makes sense with Hot Standby. Perhaps
we should redefine smart shutdown in standby mode to shut down as soon
as all read-only connections have died.
Okay. Let's work out the details.
I guess that the startup process and the walreceiver should wait
for all read-only backends to exit in the smart shutdown case, because
those backends might be waiting for replay of a WAL record that
conflicts with their queries. Is this OK? Or should we kill the
startup process and the walreceiver first?
If my guess is right, we would need to add a new PMState to cancel
recovery and replication after all read-only connections have died.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Fujii,
I guess that the startup process and the walreceiver should wait
for all read-only backends to exit in the smart shutdown case, because
those backends might be waiting for replay of a WAL record that
conflicts with their queries. Is this OK? Or should we kill the
startup process and the walreceiver first?
If my guess is right, we would need to add a new PMState to cancel
recovery and replication after all read-only connections have died.
How could existing read queries on the slave be waiting on a WAL record?
I don't follow this.
--Josh Berkus
Josh Berkus wrote:
I guess that the startup process and the walreceiver should wait
for all read-only backends to exit in the smart shutdown case, because
those backends might be waiting for replay of a WAL record that
conflicts with their queries. Is this OK? Or should we kill the
startup process and the walreceiver first?
If my guess is right, we would need to add a new PMState to cancel
recovery and replication after all read-only connections have died.
How could existing read queries on the slave be waiting on a WAL record?
Imagine that you do this in the master:
begin;
DROP TABLE foo;
<a lot of other stuff>
commit;
When the DROP is replayed in the standby, the startup process acquires a
lock on table foo, on behalf of the transaction that it's replaying. If
you run "SELECT * FROM foo" in the standby after that, it will block
until the startup process replays the COMMIT record and releases the lock.
This is similar to the deadlock situation in hot standby that was
discussed on the other thread, "Re: pgsql: In HS, Startup process sets
SIGALRM when waiting for buffer pin."
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Thu, 2010-01-21 at 09:27 +0200, Heikki Linnakangas wrote:
Right, that's the way a standby server (= one still in recovery) has
always behaved. It has made sense in the past: it's not in the spirit
of smart shutdown to kill the WAL replay immediately. "smart" means
wait for recovery to finish, then shutdown.
It's a good question if that still makes sense with Hot Standby.
Perhaps we should redefine smart shutdown in standby mode to shut down
as soon as all read-only connections have died.
It's clear that "smart" shutdown doesn't work while something is active.
Recovery is "active" and so we shouldn't shutdown. It makes sense, it
works like this already, let's leave it. Document it if needed.
--
Simon Riggs www.2ndQuadrant.com
It's a good question if that still makes sense with Hot Standby.
Perhaps we should redefine smart shutdown in standby mode to shut down
as soon as all read-only connections have died.
It's clear that "smart" shutdown doesn't work while something is active.
Recovery is "active" and so we shouldn't shutdown. It makes sense, it
works like this already, let's leave it. Document it if needed.
I don't think it's clear, or intuitive for users. In SR, recovery is
*never* done, so smart shutdown never completes (even if the master is
shut down, when I tested it). This is particularly an important issue
when you consider that some/many service and init scripts only use smart
shutdown ... so we'll get a lot of "bug reports" of "postgresql does not
shut down".
HOWEVER, I do believe this is an issue we could live with for 9.0 if
it's going to lead to a whole lot of additional debugging of SR. But if
it's an easy fix, it'll avoid a lot of complaints on pgsql-general.
--Josh Berkus
On Fri, Jan 29, 2010 at 7:01 PM, Josh Berkus <josh@agliodbs.com> wrote:
It's a good question if that still makes sense with Hot Standby.
Perhaps we should redefine smart shutdown in standby mode to shut down
as soon as all read-only connections have died.
It's clear that "smart" shutdown doesn't work while something is active.
Recovery is "active" and so we shouldn't shutdown. It makes sense, it
works like this already, let's leave it. Document it if needed.
I don't think it's clear, or intuitive for users. In SR, recovery is
*never* done, so smart shutdown never completes (even if the master is
shut down, when I tested it). This is particularly an important issue
when you consider that some/many service and init scripts only use smart
shutdown ... so we'll get a lot of "bug reports" of "postgresql does not
shut down".
Absolutely agreed. The existing smart shutdown behavior makes sense
from a certain point of view, but it doesn't seem very... what's the
word I'm looking for?... smart.
HOWEVER, I do believe this is an issue we could live with for 9.0 if
it's going to lead to a whole lot of additional debugging of SR. But if
it's an easy fix, it'll avoid a lot of complaints on pgsql-general.
Also agreed.
...Robert
On Sat, Jan 30, 2010 at 9:01 AM, Josh Berkus <josh@agliodbs.com> wrote:
I don't think it's clear, or intuitive for users. In SR, recovery is
*never* done, so smart shutdown never completes (even if the master is
shut down, when I tested it).
If you specify the trigger_file parameter in the recovery.conf, the presence
of the trigger file would complete recovery. So the existing smart shutdown
waits for it to be created. I agree that this behavior is somewhat confusing
for users.
HOWEVER, I do believe this is an issue we could live with for 9.0 if
it's going to lead to a whole lot of additional debugging of SR. But if
it's an easy fix, it'll avoid a lot of complaints on pgsql-general.
I think that the latter statement is right.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Sat, Jan 30, 2010 at 01:05, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jan 29, 2010 at 7:01 PM, Josh Berkus <josh@agliodbs.com> wrote:
It's a good question if that still makes sense with Hot Standby.
Perhaps we should redefine smart shutdown in standby mode to shut down
as soon as all read-only connections have died.
It's clear that "smart" shutdown doesn't work while something is active.
Recovery is "active" and so we shouldn't shutdown. It makes sense, it
works like this already, let's leave it. Document it if needed.
I don't think it's clear, or intuitive for users. In SR, recovery is
*never* done, so smart shutdown never completes (even if the master is
shut down, when I tested it). This is particularly an important issue
when you consider that some/many service and init scripts only use smart
shutdown ... so we'll get a lot of "bug reports" of "postgresql does not
shut down".
Absolutely agreed. The existing smart shutdown behavior makes sense
from a certain point of view, but it doesn't seem very... what's the
word I'm looking for?... smart.
Yeah.
How about we change it so it's not the default anymore?
The fact is that for most applications, it's just broken. Consider any
application that uses connection pooling, which happens to be what we
recommend people to do. A smart shutdown will never shut that server
down. But it will make it not accept new connections. Which is
probably the worst possible behavior in most cases.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Sat, Jan 30, 2010 at 12:54 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
HOWEVER, I do believe this is an issue we could live with for 9.0 if
it's going to lead to a whole lot of additional debugging of SR. But if
it's an easy fix, it'll avoid a lot of complaints on pgsql-general.
I think that the latter statement is right.
Though we've not reached consensus on smart shutdown during
recovery yet, I wrote the patch that changes its behavior:
shut down the server (including the startup process and the
walreceiver) as soon as all read-only connections have died.
The code is also available in the 'replication' branch in
my git repository.
And, let's discuss whether something like the attached patch
is required for v9.0 or not.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachments:
new_smart_shutdown_20100201.patch (text/x-patch; charset=US-ASCII)
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
***************
*** 278,283 **** typedef enum
--- 278,284 ----
PM_RECOVERY_CONSISTENT, /* consistent recovery mode */
PM_RUN, /* normal "database is alive" state */
PM_WAIT_BACKUP, /* waiting for online backup mode to end */
+ PM_WAIT_READONLY, /* waiting for read only backends to exit */
PM_WAIT_BACKENDS, /* waiting for live backends to exit */
PM_SHUTDOWN, /* waiting for bgwriter to do shutdown ckpt */
PM_SHUTDOWN_2, /* waiting for archiver and walsenders to finish */
***************
*** 2165,2171 **** pmdie(SIGNAL_ARGS)
/* and the walwriter too */
if (WalWriterPID != 0)
signal_child(WalWriterPID, SIGTERM);
! pmState = PM_WAIT_BACKUP;
}
/*
--- 2166,2173 ----
/* and the walwriter too */
if (WalWriterPID != 0)
signal_child(WalWriterPID, SIGTERM);
! /* online backup mode is active only when normal processing */
! pmState = (pmState == PM_RUN) ? PM_WAIT_BACKUP : PM_WAIT_READONLY;
}
/*
***************
*** 2840,2845 **** PostmasterStateMachine(void)
--- 2842,2870 ----
}
/*
+ * If we are in a state-machine state that implies waiting for read only
+ * backends to exit, see if they're all gone, and change state if so.
+ */
+ if (pmState == PM_WAIT_READONLY)
+ {
+ /*
+ * PM_WAIT_READONLY state ends when we have no regular backends that
+ * have been started during recovery. Since those backends might be
+ * waiting for the WAL record that conflicts with their queries to be
+ * replayed, recovery and replication need to remain until all read
+ * only backends have been gone away.
+ */
+ if (CountChildren(BACKEND_TYPE_NORMAL) == 0)
+ {
+ if (StartupPID != 0)
+ signal_child(StartupPID, SIGTERM);
+ if (WalReceiverPID != 0)
+ signal_child(WalReceiverPID, SIGTERM);
+ pmState = PM_WAIT_BACKENDS;
+ }
+ }
+
+ /*
* If we are in a state-machine state that implies waiting for backends to
* exit, see if they're all gone, and change state if so.
*/
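The two behavioral changes in the patch above can be restated as a standalone model. This is a hedged sketch, not the patch itself: the MODEL_* states and both function names are hypothetical stand-ins for the real PMState values, and the terminate_recovery flag stands in for the SIGTERM signals sent to the startup process and the walreceiver.

```c
#include <stdbool.h>

/* Hypothetical subset of postmaster states, enough to show the change. */
typedef enum ModelState
{
    MODEL_RECOVERY,         /* replaying WAL, possibly with read-only backends */
    MODEL_RUN,              /* normal processing */
    MODEL_WAIT_BACKUP,      /* waiting for online backup mode to end */
    MODEL_WAIT_READONLY,    /* new: waiting for read-only backends to exit */
    MODEL_WAIT_BACKENDS     /* waiting for live backends to exit */
} ModelState;

/*
 * State entered when a smart shutdown is requested. Mirrors the pmdie()
 * hunk: only normal processing goes through the backup wait; anything
 * else (i.e. recovery) goes to the new read-only wait.
 */
ModelState
on_smart_shutdown(ModelState current)
{
    return (current == MODEL_RUN) ? MODEL_WAIT_BACKUP : MODEL_WAIT_READONLY;
}

/*
 * One pass over the new state. Mirrors the PostmasterStateMachine()
 * hunk: once no normal backends remain, recovery and replication are
 * told to stop and the machine moves on to MODEL_WAIT_BACKENDS.
 */
ModelState
step_wait_readonly(ModelState current, int normal_backends,
                   bool *terminate_recovery)
{
    *terminate_recovery = false;

    if (current == MODEL_WAIT_READONLY && normal_backends == 0)
    {
        *terminate_recovery = true;
        return MODEL_WAIT_BACKENDS;
    }
    return current;
}
```

The point of keeping recovery alive until the last read-only backend exits is the one made earlier in the thread: a backend may be blocked on a lock that is only released when the startup process replays a later record.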
On Mon, Feb 1, 2010 at 11:49 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Sat, Jan 30, 2010 at 12:54 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
HOWEVER, I do believe this is an issue we could live with for 9.0 if
it's going to lead to a whole lot of additional debugging of SR. But if
it's an easy fix, it'll avoid a lot of complaints on pgsql-general.
I think that the latter statement is right.
Though we've not reached consensus on smart shutdown during
recovery yet, I wrote the patch that changes its behavior:
shut down the server (including the startup process and the
walreceiver) as soon as all read-only connections have died.
The code is also available in the 'replication' branch in
my git repository.
And, let's discuss whether something like the attached patch
is required for v9.0 or not.
There has been no post about this for over a month. Can I remove this
from the SR TODO items for 9.0? Thoughts? Objections?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, Mar 4, 2010 at 12:11 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
There has been no post about this for over a month. Can I remove this
from the SR TODO items for 9.0? Thoughts? Objections?
Does smart shutdown still fail to shut down a slave?
--
greg
On Thu, Mar 4, 2010 at 11:55 PM, Greg Stark <gsstark@mit.edu> wrote:
On Thu, Mar 4, 2010 at 12:11 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
There has been no post about this for over a month. Can I remove this
from the SR TODO items for 9.0? Thoughts? Objections?
Does smart shutdown still fail to shut down a slave?
Yes. More precisely, smart shutdown during recovery does not complete
until recovery ends.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Thu, Mar 4, 2010 at 10:17 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Thu, Mar 4, 2010 at 11:55 PM, Greg Stark <gsstark@mit.edu> wrote:
On Thu, Mar 4, 2010 at 12:11 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
There has been no post about this for over a month. Can I remove this
from the SR TODO items for 9.0? Thoughts? Objections?
Does smart shutdown still fail to shut down a slave?
Yes. More precisely, smart shutdown during recovery does not complete
until recovery ends.
Well, I don't think we should let smart shutdown just never terminate
when standby_mode = on. That's really a minefield for the unwary. I
think we either need to make it work, or somehow give the user an
error that says "try a different shutdown mode".
...Robert
On Thu, Mar 4, 2010 at 3:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 4, 2010 at 10:17 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
Yes. More precisely, smart shutdown during recovery does not complete
until recovery ends.
Well, I don't think we should let smart shutdown just never terminate
when standby_mode = on. That's really a minefield for the unwary. I
think we either need to make it work, or somehow give the user an
error that says "try a different shutdown mode".
It also seems dangerous to let someone think they have a standby
database ready to go and the minute they need it -- it shuts down....
--
greg
On Thu, Mar 4, 2010 at 12:39 PM, Greg Stark <gsstark@mit.edu> wrote:
On Thu, Mar 4, 2010 at 3:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 4, 2010 at 10:17 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
Yes. More precisely, smart shutdown during recovery does not complete
until recovery ends.
Well, I don't think we should let smart shutdown just never terminate
when standby_mode = on. That's really a minefield for the unwary. I
think we either need to make it work, or somehow give the user an
error that says "try a different shutdown mode".
It also seems dangerous to let someone think they have a standby
database ready to go and the minute they need it -- it shuts down....
LOL.
Yeah, that would not be cool.
...Robert