Idea: closing the loop for "pg_ctl reload"
Bug #12788 reminded me of a problem I think we've discussed before:
if you use "pg_ctl reload" to trigger reload of the postmaster's
config files, and there's something wrong with those files, there's
no warning to you of that. The postmaster just bleats to its log and
keeps running. If you don't think to look in the log to confirm
successful reload, you're left with a time bomb: the next attempt
to restart the postmaster will fail altogether because of the bad
config file.
I commented in the bug thread that there wasn't much that pg_ctl
could do about this, but on reflection it seems like something we
could fix, because pg_ctl must be able to read the postmaster.pid
file in order to issue a reload signal. Consider a design like this:
1. Extend the definition of the postmaster.pid file to add another
line, which will contain the time of the last postmaster configuration
load attempt (might as well be a numeric Unix-style timestamp) and
a boolean indication of whether that attempt succeeded or not.
2. Change pg_ctl so that after sending a reload signal, it sleeps
for a second and checks for a change in the config load timestamp
(repeat as necessary till timeout). Once it sees the timestamp
change, it's in a position to report success or failure using the
boolean. I think this should become the default behavior, though
you could define -W to mean that there should be no wait or feedback.
It's tempting to think of storing a whole error message rather than
just a boolean, but I think that would complicate the pidfile definition
undesirably. A warning to look in the postmaster log ought to be
sufficient here.
For extra credit, the pg_reload_conf() function could be made to behave
similarly.
I don't have the time to pursue this idea myself, but perhaps someone
looking for a not-too-complicated project could take it on.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On February 19, 2015 08:26:45 PM Tom Lane wrote:
I don't have the time to pursue this idea myself, but perhaps someone
looking for a not-too-complicated project could take it on.
I can have a crack at this. What's the process? Do I add it to a CF once I
have a patch, or do I do that beforehand?
jan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jan de Visser <jan@de-visser.net> writes:
I can have a crack at this. What's the process? Do I add it to a CF once I
have a patch, or do I do that beforehand?
The CF process is for reviewing things, so until you have either a patch
or a design sketch you want feedback on, there's no need for a CF entry.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On February 19, 2015 08:26:45 PM Tom Lane wrote:
Bug #12788 reminded me of a problem I think we've discussed before:
if you use "pg_ctl reload" to trigger reload of the postmaster's
config files, and there's something wrong with those files, there's
no warning to you of that. The postmaster just bleats to its log and
keeps running. If you don't think to look in the log to confirm
successful reload, you're left with a time bomb: the next attempt
to restart the postmaster will fail altogether because of the bad
config file.I commented in the bug thread that there wasn't much that pg_ctl
could do about this, but on reflection it seems like something we
could fix, because pg_ctl must be able to read the postmaster.pid
file in order to issue a reload signal. Consider a design like this:1. Extend the definition of the postmaster.pid file to add another
line, which will contain the time of the last postmaster configuration
load attempt (might as well be a numeric Unix-style timestamp) and
a boolean indication of whether that attempt succeeded or not.2. Change pg_ctl so that after sending a reload signal, it sleeps
for a second and checks for a change in the config load timestamp
(repeat as necessary till timeout). Once it sees the timestamp
change, it's in a position to report success or failure using the
boolean. I think this should become the default behavior, though
you could define -W to mean that there should be no wait or feedback.It's tempting to think of storing a whole error message rather than
just a boolean, but I think that would complicate the pidfile definition
undesirably. A warning to look in the postmaster log ought to be
sufficient here.For extra credit, the pg_reload_conf() function could be made to behave
similarly.I don't have the time to pursue this idea myself, but perhaps someone
looking for a not-too-complicated project could take it on.regards, tom lane
Here's my first crack at this. Notes:
1/ I haven't done the '-W' flag Tom mentions yet.
2/ Likewise haven't touched pg_reload_conf()
3/ Design details: I introduced a new struct in pg_ctl containing the parsed-
out data from postmaster.pid, with functions to retrieve the data and to
dispose memory allocated for it. This made the change in do_reload fairly
straightforward. I think this struct can be used in other places in pg_ctl as
well, and potentially in other files as well.
4/ Question: Can I assume pg_malloc allocated memory is set to zero? If not,
is it OK to do a memset(..., 0, ...)? I don't have much experience on any of
the esoteric platforms pgsql supports...
jan
Attachments:
Let_pg_ctl_check_the_result_of_a_postmaster_config_reload.patchtext/x-patch; charset=UTF-8; name=Let_pg_ctl_check_the_result_of_a_postmaster_config_reload.patchDownload+189-23
On March 2, 2015 12:56:23 AM Jan de Visser wrote:
... stuff ...
I seem to have mis-clicked something in the CF app - I created two entries
somehow. I think one got created when I entered the msgid of Tom's original
message with the enclosing '<...>'. If that's the case, then that may be a
bug.
jan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Mar 2, 2015 at 3:01 PM, Jan de Visser <jan@de-visser.net> wrote:
On March 2, 2015 12:56:23 AM Jan de Visser wrote:
... stuff ...I seem to have mis-clicked something in the CF app - I created two entries
somehow. I think one got created when I entered the msgid of Tom's original
message with the enclosing '<...>'. If that's the case, then that may be a
bug.
Don't worry for that. I just removed the duplicated entry.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jan de Visser <jan@de-visser.net> writes:
4/ Question: Can I assume pg_malloc allocated memory is set to zero? If not,
is it OK to do a memset(..., 0, ...)? I don't have much experience on any of
the esoteric platforms pgsql supports...
No, you need the memset. You might accidentally get already-zeroed memory
from malloc, but it's not something you can rely on.
However, you could and should use pg_malloc0, which takes care of that
for you...
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On March 2, 2015 09:50:49 AM Tom Lane wrote:
However, you could and should use pg_malloc0, which takes care of that
for you...
I am (using pg_malloc, that is). So, just to be sure: pg_malloc memsets the
block to 0, right?
My question was more along the lines if memsetting to 0 to ensure that pointer
fields are NULL and int/long fields are 0. I know they are on Linux, but don't
know if that applies to other platforms as well, or if I need to set fields
explicitly to those 'zero'/'uninitialized' values.
jan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jan de Visser <jan@de-visser.net> writes:
On March 2, 2015 09:50:49 AM Tom Lane wrote:
However, you could and should use pg_malloc0, which takes care of that
for you...
I am (using pg_malloc, that is). So, just to be sure: pg_malloc memsets the
block to 0, right?
No, it doesn't, but pg_malloc0 does. Consult the code if you're confused:
src/common/fe_memutils.c
My question was more along the lines if memsetting to 0 to ensure that pointer
fields are NULL and int/long fields are 0.
Yes, we do assume that widely, and so does a heck of a lot of other code.
In principle the C standard doesn't require that a NULL pointer be
all-zero-bits, only that casting "0" to a pointer yield a NULL pointer.
But certainly there are no modern implementations that don't represent
NULL as 0. Anybody who tried to do it differently would soon find that
hardly any real-world C code would run on their platform.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On March 2, 2015 12:44:40 PM Tom Lane wrote:
No, it doesn't, but pg_malloc0 does. Consult the code if you're confused:
src/common/fe_memutils.c
Doh! I read pg_malloc( ), not pg_malloc0.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jan de Visser <jan@de-visser.net> wrote:
On March 2, 2015 09:50:49 AM Tom Lane wrote:
However, you could and should use pg_malloc0, which takes care
of that for you...I am (using pg_malloc, that is). So, just to be sure: pg_malloc
memsets the block to 0, right?
I think you may have misread a zero character as an empty pair of
parentheses. Tom pointed out that the pg_malloc() function gives
you uninitialized memory -- you cannot count on the contents. He
further pointed out that if you need it to be initialized to '0'
bytes you should call the pg_malloc0() function rather than calling
the pg_malloc() function and running memset separately.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On March 2, 2015 12:56:23 AM Jan de Visser wrote:
Here's my first crack at this. Notes:
1/ I haven't done the '-W' flag Tom mentions yet.
2/ Likewise haven't touched pg_reload_conf()
3/ Design details: I introduced a new struct in pg_ctl containing the
parsed- out data from postmaster.pid, with functions to retrieve the data
and to dispose memory allocated for it. This made the change in do_reload
fairly straightforward. I think this struct can be used in other places in
pg_ctl as well, and potentially in other files as well.
4/ Question: Can I assume pg_malloc allocated memory is set to zero? If not,
is it OK to do a memset(..., 0, ...)? I don't have much experience on any
of the esoteric platforms pgsql supports...
Slight tweaks to better deal with pre-9.5 (6?) servers that don't write the
reload timestamp. I tested with a 9.4 server and it seemed to work...
Also implemented -W/-w. Haven't done pg_reload_conf() yet.
Attachments:
Let_pg_ctl_check_the_result_of_a_postmaster_config_reload.patchtext/x-patch; charset=UTF-8; name=Let_pg_ctl_check_the_result_of_a_postmaster_config_reload.patchDownload+216-26
On Fri, Feb 20, 2015 at 1:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
1. Extend the definition of the postmaster.pid file to add another
line, which will contain the time of the last postmaster configuration
load attempt (might as well be a numeric Unix-style timestamp) and
a boolean indication of whether that attempt succeeded or not.
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.
We do have the external_pid_file GUC so perhaps this just means we
should warn people in the release notes or somewhere and point them at
that GUC.
--
greg
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-03-03 15:21:24 +0000, Greg Stark wrote:
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.
postmaster.pid already contains considerably more than just the pid. e.g.
4071
/srv/dev/pgdev-master
1425396089
5440
/tmp
localhost
5440001 82345992
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
On 2015-03-03 15:21:24 +0000, Greg Stark wrote:
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.
postmaster.pid already contains considerably more than just the pid. e.g.
Yeah, that ship sailed long ago. I'm sure anyone who's doing this is
using "head -1" not just "cat", and what I suggest wouldn't break that.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 3/3/15 9:26 AM, Andres Freund wrote:
On 2015-03-03 15:21:24 +0000, Greg Stark wrote:
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.postmaster.pid already contains considerably more than just the pid. e.g.
4071
/srv/dev/pgdev-master
1425396089
5440
/tmp
localhost
5440001 82345992
If we already have all this extra stuff, why not include an actual error
message then, or at least the first line of an error (or maybe just swap
any newlines with spaces)?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On March 3, 2015 10:29:43 AM Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
On 2015-03-03 15:21:24 +0000, Greg Stark wrote:
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.postmaster.pid already contains considerably more than just the pid. e.g.
Yeah, that ship sailed long ago. I'm sure anyone who's doing this is
using "head -1" not just "cat", and what I suggest wouldn't break that.regards, tom lane
And what I've implemented doesn't either. The extra line is added to the back
of the file.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On March 3, 2015 11:09:29 AM Jim Nasby wrote:
On 3/3/15 9:26 AM, Andres Freund wrote:
On 2015-03-03 15:21:24 +0000, Greg Stark wrote:
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.postmaster.pid already contains considerably more than just the pid. e.g.
4071
/srv/dev/pgdev-master
1425396089
5440
/tmp
localhost5440001 82345992
If we already have all this extra stuff, why not include an actual error
message then, or at least the first line of an error (or maybe just swap
any newlines with spaces)?
Not impossible. I can play around with that and see if it's as straightforward
as I think it is.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 3/3/15 11:15 AM, Jan de Visser wrote:
On March 3, 2015 11:09:29 AM Jim Nasby wrote:
On 3/3/15 9:26 AM, Andres Freund wrote:
On 2015-03-03 15:21:24 +0000, Greg Stark wrote:
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.postmaster.pid already contains considerably more than just the pid. e.g.
4071
/srv/dev/pgdev-master
1425396089
5440
/tmp
localhost5440001 82345992
If we already have all this extra stuff, why not include an actual error
message then, or at least the first line of an error (or maybe just swap
any newlines with spaces)?Not impossible. I can play around with that and see if it's as straightforward
as I think it is.
I'm sure the code side of this is trivial; it's a question of why Tom
was objecting. It would probably be better for us to come to a
conclusion before working on this.
On a related note... something else we could do here would be to keep a
last-known-good copy of the config files around. That way if you flubbed
something at least the server would still start. I do think that any
admin worth anything would notice an error from pg_ctl, but maybe others
have a different opinion.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-03-03 11:09:29 -0600, Jim Nasby wrote:
On 3/3/15 9:26 AM, Andres Freund wrote:
On 2015-03-03 15:21:24 +0000, Greg Stark wrote:
Fwiw this concerns me slightly. I'm sure a lot of people are doing
things like "kill -HUP `cat .../postmaster.pid`" or the equivalent.postmaster.pid already contains considerably more than just the pid. e.g.
4071
/srv/dev/pgdev-master
1425396089
5440
/tmp
localhost
5440001 82345992If we already have all this extra stuff, why not include an actual error
message then, or at least the first line of an error (or maybe just swap any
newlines with spaces)?
It's often several errors and such. I think knowing that it failed and
that you should look into the error log is already quite helpful
already.
Generally we obviously need some status to indicate that the config file
has been reloaded, but that could be easily combined with storing the
error message.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers