pg_ctl non-idempotent behavior change

Started by Jeff Janes over 12 years ago, 6 messages
#1Jeff Janes
jeff.janes@gmail.com

After 87306184580c9c49717, if the postmaster dies without cleaning up (i.e.
power outage), running "pg_ctl start" just gives this message and then
exits:

pg_ctl: another server might be running

Under the old behavior, it would try to start the server anyway, and
succeed, then go through recovery and give you back a functional system.

From reading the archive, I can't really tell if this change in behavior
was intentional.

Anyway it seems like a bad thing to me. Now the user has a system that
will not start up, and is given no clue that they need to remove
"postmaster.pid" and try again.

The behavior here under the new "-I" flag seems no better in this
situation. It claims the server is running, when it only "might" be
running (and in fact is not running).

Cheers,

Jeff
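The situation described above, a "postmaster.pid" left behind by a server that is no longer running, can be told apart from a live server at least roughly. The following is a minimal sketch, not pg_ctl's actual logic: the function name `pidfile_state` and its return values are made up for illustration, and it assumes only that the first line of "postmaster.pid" holds the postmaster's PID. It is POSIX-only, because signal 0 has no equivalent meaning in Python on Windows.

```python
import os
from pathlib import Path


def pidfile_state(data_dir):
    """Classify postmaster.pid in data_dir (hypothetical helper).

    Returns 'missing', 'unreadable', 'stale', or 'maybe-running'.
    """
    pidfile = Path(data_dir) / "postmaster.pid"
    try:
        # Assumption: the first line of the file is the postmaster's PID.
        pid = int(pidfile.read_text().splitlines()[0].strip())
    except FileNotFoundError:
        return "missing"
    except (IndexError, ValueError):
        return "unreadable"  # empty or garbled file, no usable PID
    if pid <= 0:
        # Simplification for this sketch: never signal PID 0 or a group.
        return "unreadable"
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, sends nothing
    except ProcessLookupError:
        return "stale"  # no such process: the pidfile was left behind
    except PermissionError:
        return "maybe-running"  # process exists but belongs to another user
    # A process with this PID exists. After a reboot the PID may have been
    # reused by an unrelated program, hence "maybe".
    return "maybe-running"
```

A caller could treat 'stale' as safe to start over and reserve the "another server might be running" message for 'maybe-running'. The PID check alone cannot prove that the surviving process is a postmaster.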

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#1)
Re: pg_ctl non-idempotent behavior change

Jeff Janes <jeff.janes@gmail.com> writes:

After 87306184580c9c49717, if the postmaster dies without cleaning up (i.e.
power outage), running "pg_ctl start" just gives this message and then
exits:

pg_ctl: another server might be running

Under the old behavior, it would try to start the server anyway, and
succeed, then go through recovery and give you back a functional system.

From reading the archive, I can't really tell if this change in behavior
was intentional.

Hmm. I rather thought we had agreed not to change the default behavior,
but the commit message fairly clearly says that the default behavior is
being changed. This case shows that that change was inadequately
thought through.

Anyway it seems like a bad thing to me. Now the user has a system that
will not start up, and is given no clue that they need to remove
"postmaster.pid" and try again.

Yeah, this is not tolerable. We could think about improving the logic
to have a stronger check on whether the old server is really there or
not (ie it should be doing something more like pg_ping and less like
just checking if the pidfile is there). But given how close we are to
beta, maybe the best thing is to revert that change for now and put it
back on the to-think-about-for-9.4 list. Peter?

regards, tom lane
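The "more like pg_ping" idea above amounts to asking the server itself rather than trusting the pidfile. As a rough sketch only: the helper below (the name `server_accepts_connections` is hypothetical) checks whether anything accepts TCP connections on a given host and port. It assumes the server listens on TCP at the default port 5432, which may not hold for a particular installation, and it does not speak the PostgreSQL protocol, so it cannot tell a PostgreSQL server from any other listener.

```python
import socket


def server_accepts_connections(host="localhost", port=5432, timeout=1.0):
    """Return True if something accepts a TCP connection on host:port.

    Hypothetical helper: a successful connect only shows that a listener
    exists, not that it is a healthy PostgreSQL server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A probe like this complements the pidfile check rather than replacing it: a server that is still in crash recovery, or one that listens only on a Unix-domain socket, would not answer here.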

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#2)
Re: pg_ctl non-idempotent behavior change

On Sat, 2013-04-27 at 14:24 -0400, Tom Lane wrote:

Yeah, this is not tolerable. We could think about improving the logic
to have a stronger check on whether the old server is really there or
not (ie it should be doing something more like pg_ping and less like
just checking if the pidfile is there). But given how close we are to
beta, maybe the best thing is to revert that change for now and put it
back on the to-think-about-for-9.4 list. Peter?

Reverted.


#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: pg_ctl non-idempotent behavior change

Tom Lane wrote:

Yeah, this is not tolerable. We could think about improving the logic
to have a stronger check on whether the old server is really there or
not (ie it should be doing something more like pg_ping and less like
just checking if the pidfile is there). But given how close we are to
beta, maybe the best thing is to revert that change for now and put it
back on the to-think-about-for-9.4 list. Peter?

Are we going to unrevert this patch for 9.5?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#5Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#4)
Re: pg_ctl non-idempotent behavior change

On Mon, Aug 4, 2014 at 05:07:47PM -0400, Alvaro Herrera wrote:

Are we going to unrevert this patch for 9.5?

Seems no one is thinking of restoring this patch and working on the
issue.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


#6Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#5)
Re: pg_ctl non-idempotent behavior change

On 10/11/14 6:54 PM, Bruce Momjian wrote:

Are we going to unrevert this patch for 9.5?

Seems no one is thinking of restoring this patch and working on the
issue.

I had postponed work on this issue and set out to create a test
infrastructure so that all the subtle behavioral dependencies mentioned
in the thread could be expressed in code rather than prose.
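To illustrate what expressing these dependencies "in code rather than prose" might look like, here is a small table-driven sketch. It is not the test infrastructure referred to above. The policy function `start_decision` and the case table are hypothetical, written only to capture the two expectations raised in this thread: a leftover pidfile from a dead server should not block startup, and a live server should not be started twice.

```python
def start_decision(pidfile_exists, process_alive):
    """Hypothetical start policy: return 'start' or 'refuse'."""
    if pidfile_exists and process_alive:
        return "refuse"  # another server really is running
    # No pidfile, or a pidfile whose process is gone (e.g. power outage):
    # start anyway and let crash recovery run.
    return "start"


# Each row: (pidfile_exists, process_alive, expected decision).
CASES = [
    (False, False, "start"),   # clean data directory
    (True, False, "start"),    # stale pidfile after a crash
    (True, True, "refuse"),    # server already running
]


def check_cases():
    """Raise AssertionError if the policy disagrees with the table."""
    for pidfile_exists, process_alive, expected in CASES:
        actual = start_decision(pidfile_exists, process_alive)
        assert actual == expected, (pidfile_exists, process_alive, actual)
    return len(CASES)
```

Calling `check_cases()` runs every row. Adding a row for each behavior discussed in a thread like this one turns a disagreement about intent into a failing test.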
