Clarifying "server starting" messaging in pg_ctl start without --wait
Hi Postgres Devs,
I had a suggestion regarding the output pg_ctl gives when you use it to
start the postgres server. At first I was going to write a patch, but then
I decided to just ask you guys first to see what you think.
I had an issue earlier where I was trying to upgrade my postgres database
to a new major version and incidentally a new pg_catalog version, and
therefore the new code could no longer run the existing data directory
without pg_upgrade or pg_dump (I ended up needing pg_dump). Initially I
was very confused because I tried running "pg_ctl -D datadir -l logfile
start" like normal, and it just said "server starting", yet the server was
not starting. It took me a while to realize that I needed to use the
"--wait" / "-w" option to actually wait and test whether the server was
really starting, at which point it told me there was a problem and to check
the log.
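A minimal sketch of the workaround described above, assuming pg_ctl is on the PATH. The helper name start_and_verify and the 20-line log tail are illustrative choices, not anything pg_ctl provides:

```shell
#!/bin/sh
# start_and_verify is a hypothetical helper name, not part of PostgreSQL.
# Usage: start_and_verify DATADIR LOGFILE
start_and_verify() {
    datadir=$1
    logfile=$2
    # -w makes pg_ctl wait for startup to finish and report success or failure.
    if pg_ctl -D "$datadir" -l "$logfile" -w start; then
        echo "server is up"
        return 0
    fi
    echo "server failed to start; last lines of $logfile:" >&2
    # The log file may not exist if the failure happened very early.
    if [ -f "$logfile" ]; then
        tail -n 20 "$logfile" >&2
    fi
    return 1
}
```

Called as "start_and_verify datadir logfile", it returns non-zero when pg_ctl reports a failed start, so a script can stop there instead of assuming the server is up.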
I'm concerned some new users may not understand this behavior of pg_ctl, so
I wanted to suggest that we add some additional messaging after "server
starting" - something like:
$ pg_ctl -D datadir -l logfile start
server starting
(to wait for confirmation that server actually started, try pg_ctl again
with --wait)
What do you guys think? Is it important to keep pg_ctl output more terse
than this? I do think something like this could help new users avoid
frustration.
I'm happy to write a patch for this if it's helpful, though it's such a
simple change that if one of the core devs wants this s/he can probably
more easily just add it themselves.
Cheers,
Ryan
On 12/20/16 3:31 PM, Ryan Murphy wrote:
I'm concerned some new users may not understand this behavior of pg_ctl,
so I wanted to suggest that we add some additional messaging after
"server starting" - something like:
$ pg_ctl -D datadir -l logfile start
server starting
(to wait for confirmation that server actually started, try pg_ctl again
with --wait)
Maybe the fix is to make --wait the default?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Dec 20, 2016 at 03:43:11PM -0500, Peter Eisentraut wrote:
On 12/20/16 3:31 PM, Ryan Murphy wrote:
I'm concerned some new users may not understand this behavior of pg_ctl,
so I wanted to suggest that we add some additional messaging after
"server starting" - something like:
$ pg_ctl -D datadir -l logfile start
server starting
(to wait for confirmation that server actually started, try pg_ctl again
with --wait)
Maybe the fix is to make --wait the default?
+1
It's not super useful to have the prompt return while the server is
still starting up.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Ryan Murphy <ryanfmurphy@gmail.com> writes:
I'm concerned some new users may not understand this behavior of pg_ctl, so
I wanted to suggest that we add some additional messaging after "server
starting" - something like:
$ pg_ctl -D datadir -l logfile start
server starting
(to wait for confirmation that server actually started, try pg_ctl again
with --wait)
That seems annoyingly verbose and nanny-ish. Perhaps we could get the
point across like this:
$ pg_ctl -D datadir -l logfile start
requested server to start
$
regards, tom lane
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
Maybe the fix is to make --wait the default?
I was wondering about that too ... does anyone remember the rationale
for the current behavior? But the message for the non-wait case seems
like it could stand to be improved independently of that.
regards, tom lane
Tom Lane wrote:
Ryan Murphy <ryanfmurphy@gmail.com> writes:
I'm concerned some new users may not understand this behavior of pg_ctl, so
I wanted to suggest that we add some additional messaging after "server
starting" - something like:
$ pg_ctl -D datadir -l logfile start
server starting
(to wait for confirmation that server actually started, try pg_ctl again
with --wait)
That seems annoyingly verbose and nanny-ish. Perhaps we could get the
point across like this:
$ pg_ctl -D datadir -l logfile start
requested server to start
+1, but also +1 to making --wait the default. Extra points if systemd
start scripts are broken by the change ;-)
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Dec 20, 2016 at 1:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
Maybe the fix is to make --wait the default?
I was wondering about that too ... does anyone remember the rationale
for the current behavior? But the message for the non-wait case seems
like it could stand to be improved independently of that.
Not totally independent.
If the default is changed to --wait then the message can be written
assuming the user understands what "--no-wait" does; but if the default is
left "--no-wait" then cluing the user into the asynchronous behavior and
telling them how to get the more expected synchronous behavior would be
helpful.
David J.
On 12/20/16 3:49 PM, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
Maybe the fix is to make --wait the default?
I was wondering about that too ... does anyone remember the rationale
for the current behavior?
Probably because that didn't work reliably before pg_ctl learned how to
get the right port number and PQping() and such things.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 12/20/16 3:43 PM, Peter Eisentraut wrote:
On 12/20/16 3:31 PM, Ryan Murphy wrote:
I'm concerned some new users may not understand this behavior of pg_ctl,
so I wanted to suggest that we add some additional messaging after
"server starting" - something like:
$ pg_ctl -D datadir -l logfile start
server starting
(to wait for confirmation that server actually started, try pg_ctl again
with --wait)
Maybe the fix is to make --wait the default?
Here is a patch for that.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-pg_ctl-Change-default-to-wait-for-all-actions.patch (text/x-patch, +21 -31)
On Fri, Dec 23, 2016 at 10:47 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 12/20/16 3:43 PM, Peter Eisentraut wrote:
On 12/20/16 3:31 PM, Ryan Murphy wrote:
I'm concerned some new users may not understand this behavior of pg_ctl,
so I wanted to suggest that we add some additional messaging after
"server starting" - something like:
$ pg_ctl -D datadir -l logfile start
server starting
(to wait for confirmation that server actually started, try pg_ctl again
with --wait)
Maybe the fix is to make --wait the default?
Here is a patch for that.
Is there still a use case for --no-wait in the real world? Why not
simply rip it out?
--
Michael
Michael Paquier <michael.paquier@gmail.com> writes:
Is there still a use case for --no-wait in the real world?
Sure. Most system startup scripts aren't going to want to wait.
If we take it out those people will go back to starting the postmaster
by hand.
regards, tom lane
On 12/23/16 6:10 PM, Tom Lane wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
Is there still a use case for --no-wait in the real world?
Sure. Most system startup scripts aren't going to want to wait.
If we take it out those people will go back to starting the postmaster
by hand.
Presumably they could just background it... since it's not going to be
long-lived it's presumably not that big a deal. Though, seems like many
startup scripts like to make sure what they're starting is actually working.
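The two options under discussion for a startup script could look like the following sketch. It assumes pg_ctl is on the PATH and accepts -W as the short form of --no-wait. Both function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helpers for a boot script; the names are illustrative only.
# Usage for both: FUNCTION DATADIR LOGFILE

# Option 1: request a start and return immediately (explicit no-wait).
start_no_wait() {
    pg_ctl -D "$1" -l "$2" -W start
}

# Option 2: let pg_ctl wait, but run it in the background so the boot
# script itself is not blocked. Prints the PID of the backgrounded pg_ctl.
start_wait_in_background() {
    pg_ctl -D "$1" -l "$2" -w start >/dev/null 2>&1 &
    echo "$!"
}
```

With the second form, a script that does care about the outcome can later "wait" on the printed PID and inspect its exit status.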
What might be interesting is a mode that waited for everything but
recovery so at least you know the config is valid, the port is
available, etc. That would be much harder to handle externally.
</feature_creep>
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Fri, Dec 23, 2016 at 7:25 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 12/23/16 6:10 PM, Tom Lane wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
Is there still a use case for --no-wait in the real world?
Sure. Most system startup scripts aren't going to want to wait.
If we take it out those people will go back to starting the postmaster
by hand.
Presumably they could just background it... since it's not going to be
long-lived it's presumably not that big a deal. Though, seems like many
startup scripts like to make sure what they're starting is actually working.
Making --wait the default may or may not be sensible -- I'm not sure
-- but removing --no-wait is clearly a bad idea, and we shouldn't do
it. The fact that the problems created by removing it might be
solvable doesn't mean that it's a good idea to create them in the
first place.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Making --wait the default may or may not be sensible -- I'm not sure
-- but removing --no-wait is clearly a bad idea, and we shouldn't do
it. The fact that the problems created by removing it might be
solvable doesn't mean that it's a good idea to create them in the
first place.
I agree with Robert - pg_ctl is no doubt used in all kinds of scripts that
would then have to change.
It may make sense to have --wait be the default though - certainly less
confusing to new users!
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed
(Though I could not check "make installcheck-world" as passed because it failed 1 test, I think it basically SHOULD pass - see my comment below.)
Patch looks good to me and does what we talked about, and Docs seem clear and correct.
I was able to build Postgres and run pg_ctl and observe that it waited by default for the 'start' action, which addresses my original concern.
`make` and `make install` went fine, and `make check` did as well, but `make installcheck-world` said (after a while):
=======================
1 of 55 tests failed.
=======================
The diff summary is here:
*** /home/my_secret_local_username/my/secret/path/to/postgres/src/interfaces/ecpg/test/expected/connect-test5.stderr 2016-08-23 10:00:53.000000000 -0500
--- /home/my_secret_local_username/my/secret/path/to/postgres/src/interfaces/ecpg/test/results/connect-test5.stderr 2017-01-06 00:08:40.000000000 -0600
***************
*** 36,42 ****
[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ECPGconnect: opening database <DEFAULT> on <DEFAULT> port <DEFAULT> for user regress_ecpg_user2
[NO_PID]: sqlca: code: 0, state: 00000
! [NO_PID]: ECPGconnect: could not open database: FATAL: database "regress_ecpg_user2" does not exist
[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ecpg_finish: connection main closed
--- 36,42 ----
[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ECPGconnect: opening database <DEFAULT> on <DEFAULT> port <DEFAULT> for user regress_ecpg_user2
[NO_PID]: sqlca: code: 0, state: 00000
! [NO_PID]: ECPGconnect: could not open database: FATAL: database "my_secret_local_username" does not exist
[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ecpg_finish: connection main closed
***************
*** 73,79 ****
[NO_PID]: sqlca: code: -220, state: 08003
[NO_PID]: ECPGconnect: opening database <DEFAULT> on <DEFAULT> port <DEFAULT> for user regress_ecpg_user2
[NO_PID]: sqlca: code: 0, state: 00000
! [NO_PID]: ECPGconnect: could not open database: FATAL: database "regress_ecpg_user2" does not exist
[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ecpg_finish: connection main closed
--- 73,79 ----
[NO_PID]: sqlca: code: -220, state: 08003
[NO_PID]: ECPGconnect: opening database <DEFAULT> on <DEFAULT> port <DEFAULT> for user regress_ecpg_user2
[NO_PID]: sqlca: code: 0, state: 00000
! [NO_PID]: ECPGconnect: could not open database: FATAL: database "my_secret_local_username" does not exist
[NO_PID]: sqlca: code: 0, state: 00000
[NO_PID]: ecpg_finish: connection main closed
======================================================================
I don't actually believe this to indicate a problem though - I think perhaps there's a problem with this test, or with how I am running it. The only diff was that when it (correctly) complained of a nonexistent database, it referred to my username that I was logged in as, instead of the test database name "regress_ecpg_user2". I don't think this has anything to do with the changes to pg_ctl.
I could be wrong though! I am going to leave this as "Needs review" until someone more familiar with the project double-checks this.
On 1/6/17 12:24 AM, Ryan Murphy wrote:
I don't actually believe this to indicate a problem though - I think perhaps there's a problem with this test, or with how I am running it. The only diff was that when it (correctly) complained of a nonexistent database, it referred to my username that I was logged in as, instead of the test database name "regress_ecpg_user2". I don't think this has anything to do with the changes to pg_ctl.
Hrm, I'm not able to reproduce that problem. Can you run make
installcheck-world on a checkout of master and see if you get the same
thing?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On 1/7/17 11:14 PM, Ryan Murphy wrote:
So I realized that I've never actually done `make world` before, and
when I try that I get a funny error:
make -C doc all
make -C src all
make -C sgml all
...
***
ERROR: `osx' is missing on your system.
...
Do you have any idea what that means? I googled it but couldn't find
anything. I can dig around more or ask the mailing list if you have no
idea.
It's good to reply on the list (which I've cc'd) as there's lots of
folks that can help you that way.
To answer your question, the error has something to do with building
docs. You don't really need to do that, and it can be rather painful to
get setup to build them. I wouldn't bother for now.
But anyway, last time I think I was running `make installcheck-world`
without first running `make world`, which I think is wrong, right? - need
`make world` first?
No, you don't. Some of our make targets can be a bit confusing in this
regard...
installcheck-world will install all code (and docs, if they've been
built) and then run full tests against them. There's no tests for the
docs, so it doesn't matter if they get installed. The only "test" for
docs is whether they build, but IMHO it's not worth it to ask a new
reviewer to try and test that unless the patch has a *lot* of changes to
the docs.
In any case, docs won't explain why you were seeing a test failure and I
wasn't.
Hmm... I just thought of something though... do you have PGUSER set?
That might break installcheck-world.
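One way to test that guess is to rerun the suite with PGUSER removed from the environment. This is only a sketch: without_pguser is a hypothetical helper name, and whether PGUSER is actually the cause is unconfirmed.

```shell
#!/bin/sh
# without_pguser is a hypothetical helper name.
# Runs the given command in a subshell with PGUSER unset, leaving the
# caller's own environment untouched.
without_pguser() {
    (
        unset PGUSER
        "$@"
    )
}

# Example (run from the top of the source tree):
#   without_pguser make installcheck-world
```

Because the unset happens in a subshell, PGUSER is still set in the calling shell afterwards, which makes it easy to compare runs with and without it.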
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Fri, Jan 6, 2017 at 11:54 AM, Ryan Murphy <ryanfmurphy@gmail.com> wrote:
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed
(Though I could not check "make installcheck-world" as passed because it
failed 1 test, I think it basically SHOULD pass - see my comment below.)
Patch looks good to me and does what we talked about, and Docs seem clear
and correct.
I was able to build Postgres and run pg_ctl and observe that it waited by
default for the 'start' action, which addresses my original concern.
`make` and `make install` went fine, and `make check` did as well, but
`make installcheck-world` said (after a while):
=======================
1 of 55 tests failed.
=======================
I am sure you would get this error even without the patch.
--
Thank you,
Beena Emerson
Have a Great Day!
Hello,
On Wed, Jan 11, 2017 at 6:06 PM, Beena Emerson <memissemerson@gmail.com>
wrote:
On Fri, Jan 6, 2017 at 11:54 AM, Ryan Murphy <ryanfmurphy@gmail.com>
wrote:
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed
(Though I could not check "make installcheck-world" as passed because it
failed 1 test, I think it basically SHOULD pass - see my comment below.)
Patch looks good to me and does what we talked about, and Docs seem clear
and correct.
I was able to build Postgres and run pg_ctl and observe that it waited by
default for the 'start' action, which addresses my original concern.
`make` and `make install` went fine, and `make check` did as well, but
`make installcheck-world` said (after a while):
=======================
1 of 55 tests failed.
=======================
I am sure you would get this error even without the patch.
The patch is good. I do not have any comments to make about the patch.
Ryan, try running 'make install-world' and then 'make -i installcheck-world'.
The -i option ignores errors and proceeds, so you can check whether any other
tests fail. This is a separate issue, unrelated to this patch. I do not think
it should stop us from changing the status.
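With -i, make keeps going past failures, so the failing suites have to be located afterwards. A small sketch, assuming each failing pg_regress-based suite leaves a regression.diffs file in its own directory. The helper name list_regression_diffs is hypothetical:

```shell
#!/bin/sh
# list_regression_diffs is a hypothetical helper name.
# Prints the path of every regression.diffs file below ROOT (default: the
# current directory). Assumption: a failing pg_regress-based suite leaves
# such a file behind in its directory.
list_regression_diffs() {
    root=${1:-.}
    find "$root" -type f -name regression.diffs
}

# Example (run from the top of the source tree after the test run):
#   list_regression_diffs .
```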
The status is now updated to 'Ready for committer'
Thank you,
Beena Emerson
Have a Great Day!
Thanks for the review, Beena. I'm glad the patch is ready to go!
I think because of my environment/setup, I get errors when I try "make
install-world", but I'm at work now. When I have time I will go back,
try again, and figure out what is wrong. I'll let you guys know if I have
any questions.
Take care,
Ryan