pgsql: pg_ctl: Detect current standby state from pg_control

Started by Peter Eisentrautover 9 years ago9 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

pg_ctl: Detect current standby state from pg_control

pg_ctl used to determine whether a server was in standby mode by looking
for a recovery.conf file. With this change, it instead looks into
pg_control, which is potentially more accurate. There are also
occasional discussions about removing recovery.conf, so this removes one
dependency.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/c1dc51d4844e2a37412b034c07c1c5a439ba0b9d

Modified Files
--------------
src/backend/utils/misc/pg_controldata.c | 12 +++++++++
src/bin/pg_controldata/pg_controldata.c | 4 +++
src/bin/pg_ctl/pg_ctl.c | 47 ++++++++++++++++++++++++++-------
src/common/controldata_utils.c | 15 +++++------
src/include/common/controldata_utils.h | 2 +-
5 files changed, 62 insertions(+), 18 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: pgsql: pg_ctl: Detect current standby state from pg_control

Peter Eisentraut <peter_e@gmx.net> writes:

pg_ctl: Detect current standby state from pg_control

Coverity thinks that this patch introduced a bunch of
null-pointer-dereference hazards, and AFAICS it is right.
The change in get_controlfile()'s API is completely broken
and needs to be undone.

regards, tom lane

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#3Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#2)
Re: pgsql: pg_ctl: Detect current standby state from pg_control

On Mon, Sep 26, 2016 at 12:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

pg_ctl: Detect current standby state from pg_control

Coverity thinks that this patch introduced a bunch of
null-pointer-dereference hazards, and AFAICS it is right.
The change in get_controlfile()'s API is completely broken
and needs to be undone.

So you'd rather have a sleep(1) and complicate this code to replace it
with a WaitLatch() when the code is used by the backend? I don't agree
with that as the new API is cleaner as presented, though I agree that
it is a change that caller does not get any result in case of a CRC
mismatch, but who's going to use results that cannot be trusted
anyway? It seems to me that the correct fix here is that
pg_controldata.c should just exit like in the attached patch. Somewhat
I missed that during my lookup of c1dc51d4.

If we would still want callers of get_controlfile() to be allowed to
read untrustworthy results, we could just add a boolean status flag..
--
Michael

Attachments:

controldata-untrust.patchapplication/x-download; name=controldata-untrust.patchDownload+4-1
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#3)
Re: pgsql: pg_ctl: Detect current standby state from pg_control

Michael Paquier <michael.paquier@gmail.com> writes:

On Mon, Sep 26, 2016 at 12:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Coverity thinks that this patch introduced a bunch of
null-pointer-dereference hazards, and AFAICS it is right.
The change in get_controlfile()'s API is completely broken
and needs to be undone.

So you'd rather have a sleep(1) and complicate this code to replace it
with a WaitLatch() when the code is used by the backend? I don't agree
with that as the new API is cleaner as presented, though I agree that
it is a change that caller does not get any result in case of a CRC
mismatch, but who's going to use results that cannot be trusted
anyway?

Well, they certainly won't be using any untrustworthy reports if
pg_controldata dumps core instead of printing the data, which is what
will happen in HEAD. Although I don't understand your reference to
needing a sleep. I think that it's 100% pointless for get_control_dbstate
to be worried about transient CRC failures. If writes to pg_control
aren't atomic then we have problems enormously larger than whether
"pg_ctl promote" throws an error or not.

It seems to me that the correct fix here is that
pg_controldata.c should just exit like in the attached patch.

No. Removing pg_controldata's designed-in ability to print information
from corrupted pg_control files was not part of the advertised impact of
this patch, and I will absolutely not hold still for it happening as an
accidental consequence of ill-advised refactoring.

It may well be that you need two different APIs for get_controlfile(),
maybe with a flag, or two different wrappers around some common code.
Or just go back to using #ifdef FRONTEND, and forget the silliness of
expecting pg_ctl not to throw an error for bad pg_control content.

BTW, now that get_controlfile has become a global symbol rather than
something private to pg_controldata.c, I suggest that it needs a less
generic name. "control file" could apply to a lot of things.
Possibly "get_pg_control_contents" would be better.

regards, tom lane

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#5Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#4)
Re: pgsql: pg_ctl: Detect current standby state from pg_control

On 9/26/16 8:53 AM, Tom Lane wrote:

I think that it's 100% pointless for get_control_dbstate
to be worried about transient CRC failures. If writes to pg_control
aren't atomic then we have problems enormously larger than whether
"pg_ctl promote" throws an error or not.

The new code was designed to handle that, but if we think it's not an
issue, then we can simplify it. I'm on it.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#5)
Re: [COMMITTERS] pgsql: pg_ctl: Detect current standby state from pg_control

On 9/26/16 7:56 PM, Peter Eisentraut wrote:

On 9/26/16 8:53 AM, Tom Lane wrote:

I think that it's 100% pointless for get_control_dbstate
to be worried about transient CRC failures. If writes to pg_control
aren't atomic then we have problems enormously larger than whether
"pg_ctl promote" throws an error or not.

The new code was designed to handle that, but if we think it's not an
issue, then we can simplify it. I'm on it.

How about this?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Fix-CRC-check-handling-in-get_controlfile.patchinvalid/octet-stream; name=0001-Fix-CRC-check-handling-in-get_controlfile.patchDownload+31-30
#7Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#6)
Re: [COMMITTERS] pgsql: pg_ctl: Detect current standby state from pg_control

On Tue, Sep 27, 2016 at 9:45 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

On 9/26/16 7:56 PM, Peter Eisentraut wrote:

On 9/26/16 8:53 AM, Tom Lane wrote:

I think that it's 100% pointless for get_control_dbstate
to be worried about transient CRC failures. If writes to pg_control
aren't atomic then we have problems enormously larger than whether
"pg_ctl promote" throws an error or not.

The new code was designed to handle that, but if we think it's not an
issue, then we can simplify it. I'm on it.

How about this?

+       if (crc_ok_p)
+           *crc_ok_p = false;
+       else
+       {
+           pfree(ControlFile);
+           return NULL;
+       }
Seems overcomplicated to me. How about returning the control file all
the time and let the caller pfree the result? You could then use
crc_ok in pg_ctl.c's get_control_dbstate() to do the decision-making.

Also, if the sleep() loop is removed for promote -w, we assume that
we'd fail immediately in case of a corrupted control file. Could it be
possible that after a couple of seconds the backend gets it right and
overwrites a correct control file on top of a corrupted one? I am just
curious about such possibilities, still I am +1 for removing the
pg_usleep loop and failing right away as you propose.

If we do the renaming of get_controlfile(), it may be also a good idea
to do the same for get_configdata() in config_info.c for consistency.
That's not really critical IMHO though.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#7)
Re: [COMMITTERS] pgsql: pg_ctl: Detect current standby state from pg_control

On Tue, Sep 27, 2016 at 9:55 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Seems overcomplicated to me. How about returning the control file all
the time and let the caller pfree the result? You could then use
crc_ok in pg_ctl.c's get_control_dbstate() to do the decision-making.

In short I would just go with the attached and call it a day.
--
Michael

Attachments:

control-crc-checks-v2.patchapplication/x-patch; name=control-crc-checks-v2.patchDownload+32-38
#9Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#8)
Re: [COMMITTERS] pgsql: pg_ctl: Detect current standby state from pg_control

On 9/28/16 12:44 AM, Michael Paquier wrote:

On Tue, Sep 27, 2016 at 9:55 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Seems overcomplicated to me. How about returning the control file all
the time and let the caller pfree the result? You could then use
crc_ok in pg_ctl.c's get_control_dbstate() to do the decision-making.

In short I would just go with the attached and call it a day.

Pushed that way. Thanks!

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers