pg_ctl configurable timeout
I'm having trouble with the hardcoded 60 second timeout in pg_ctl. pg_ctl
sometimes just times out and there is no way to make it wait a little longer.
I would like to add an option to be able to change that, say
pg_ctl -w --timeout=120. Comments?
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl. pg_ctl
sometimes just times out and there is no way to make it wait a little longer.
I would like to add an option to be able to change that, say
pg_ctl -w --timeout=120. Comments?
+1
I played with 2GB of shared buffers, and the stop action takes 10-20s. On a
system with more memory, 60s is not enough.
Zdenek
Zdenek Kotala wrote:
Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl.
pg_ctl sometimes just times out and there is no way to make it wait a
little longer. I would like to add an option to be able to change
that, say pg_ctl -w --timeout=120. Comments?
+1
I played with 2GB of shared buffers, and the stop action takes 10-20s. On a
system with more memory, 60s is not enough.
Huh? I have never seen this problem.
Joshua D. Drake
Zdenek
Joshua D. Drake wrote:
Zdenek Kotala wrote:
Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl.
pg_ctl sometimes just times out and there is no way to make it wait a
little longer. I would like to add an option to be able to change
that, say pg_ctl -w --timeout=120. Comments?
+1
I played with 2GB of shared buffers, and the stop action takes 10-20s. On a
system with more memory, 60s is not enough.
Huh? I have never seen this problem.
It happened when I stopped the server after a heavy performance test. I
assumed that postgres was checking for dirty pages in the buffers, but I
did not investigate it.
Zdenek
On Friday, 17 August 2007, Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl. pg_ctl
sometimes just times out and there is no way to make it wait a little
longer. I would like to add an option to be able to change that, say
pg_ctl -w --timeout=120. Comments?
Lost track of this, but it keeps biting me.
Somehow, the 60 second timeout seems completely arbitrary anyway. Maybe we
should remove it altogether. We could add an option as described above, but
then the packager who creates the init script or whoever creates the initial
configuration will have to make an equally arbitrary choice. And most likely
you will not notice that your configuration is insufficient until you are
really in a bind.
What should we do?
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
------- Original Message -------
From: Peter Eisentraut <peter_e@gmx.net>
To: pgsql-hackers@postgresql.org
Sent: 29/10/07, 17:54:00
Subject: Re: [HACKERS] pg_ctl configurable timeout
On Friday, 17 August 2007, Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl. pg_ctl
sometimes just times out and there is no way to make it wait a little
longer. I would like to add an option to be able to change that, say
pg_ctl -w --timeout=120. Comments?
Lost track of this, but it keeps biting me.
Somehow, the 60 second timeout seems completely arbitrary anyway. Maybe we
should remove it altogether. We could add an option as described above, but
then the packager who creates the init script or whoever creates the initial
configuration will have to make an equally arbitrary choice. And most likely
you will not notice that your configuration is insufficient until you are
really in a bind.
What should we do?
We need the option on Windows to prevent dependent services from being started too quickly.
The same problem occurs there with pg_ctl reporting its status to the service control manager. The SCM interface handles this by having the service regularly increment a variable and, if required, update the estimated startup time. A similar architecture might be feasible if we had the postmaster signal pg_ctl periodically until it has started, at which point a different signal is sent. We would then time out only if no pulse or started signal is received within X seconds.
Regards, Dave
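The pulse idea above can be illustrated with a small C sketch. It is only a model under stated assumptions: the event types, the function name wait_with_pulses, and the notion of replaying a recorded list of events are all hypothetical and do not correspond to any existing pg_ctl or postmaster interface. It shows the key property of the proposal: the deadline is measured from the last sign of life, not from the moment pg_ctl began waiting.

```c
#include <stddef.h>

/* Hypothetical event kinds; nothing like this exists in pg_ctl today. */
typedef enum { EV_PULSE, EV_STARTED } event_kind;

typedef struct
{
    long        at;     /* seconds since pg_ctl began waiting */
    event_kind  kind;
} wait_event;

typedef enum { WAIT_STARTED, WAIT_TIMED_OUT } wait_result;

/*
 * Replay a time-ordered list of events. We give up only when the gap
 * since the previous sign of life exceeds quiet_limit seconds, so a slow
 * but steadily pulsing startup is never cut off.
 */
wait_result
wait_with_pulses(const wait_event *events, size_t n, long quiet_limit)
{
    long    last_seen = 0;  /* waiting starts at time 0 */
    size_t  i;

    for (i = 0; i < n; i++)
    {
        if (events[i].at - last_seen > quiet_limit)
            return WAIT_TIMED_OUT;  /* silence lasted too long */
        if (events[i].kind == EV_STARTED)
            return WAIT_STARTED;
        last_seen = events[i].at;   /* pulse: restart the quiet period */
    }
    return WAIT_TIMED_OUT;          /* events ran out, never started */
}
```

With a quiet limit of 60 seconds, pulses at 10, 20 and 70 seconds followed by a started signal at 100 seconds count as a successful start, even though the total wait exceeds 60 seconds. A single pulse at 10 seconds followed by silence until 90 seconds is reported as a timeout.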
Peter Eisentraut wrote:
On Friday, 17 August 2007, Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl. pg_ctl
sometimes just times out and there is no way to make it wait a little
longer. I would like to add an option to be able to change that, say
pg_ctl -w --timeout=120. Comments?
Lost track of this, but it keeps biting me.
Somehow, the 60 second timeout seems completely arbitrary anyway. Maybe we
should remove it altogether. We could add an option as described above, but
then the packager who creates the init script or whoever creates the initial
configuration will have to make an equally arbitrary choice. And most likely
you will not notice that your configuration is insufficient until you are
really in a bind.
What should we do?
How about an environment variable to control the timeout? Is that
cleaner?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Mon, 2007-10-29 at 17:34 -0300, Alvaro Herrera wrote:
Maybe hack the postmaster to have a new special connection mode which
keeps the connection open until the startup process exits, to avoid
polling continuously (ideally report progress too, if at all
possible).
That sounds good to me. The spurious connection messages look weird, and
it's difficult to explain that this is one of the ERRORs that isn't an
error. There has to be a way for pg_ctl to ask whether the server is still
starting up without causing a message every second in the server log.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
Peter Eisentraut <peter_e@gmx.net> writes:
Somehow, the 60 second timeout seems completely arbitrary anyway. Maybe we
should remove it altogether. We could add an option as described above, but
then the packager who creates the init script or whoever creates the initial
configuration will have to make an equally arbitrary choice.
Yeah. One problem is that we use the same timeout for startup and
shutdown, which really are entirely different; and the other problem
is that we've not wanted pg_ctl to have too many smarts about the
server's internal behavior.
On startup, it would be reasonable to assume failure if we don't see
a postmaster pid-file appear PDQ, but then after that it might stay
in the "database is starting up" state for a long time (maybe even
indefinitely if it's a warm standby server). Still, you could argue
that it's reasonable to keep waiting as long as the postmaster keeps
returning "database is starting up" when pinged.
On shutdown, it'd be reasonable to expect that the postmaster starts
returning "database is shutting down" almost immediately, and to report
failure if not. However, if it was a default "smart mode" stop you
could again wait indefinitely for clients to decide to give up their
sessions. I'm not sure if it's sane for pg_ctl to wait indefinitely
in that scenario.
I agree that just pushing the choice of timeout onto the user's
shoulders wouldn't be much of an improvement. He can always hit ^C
if he gets tired of waiting.
regards, tom lane
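The startup half of the policy described above can be written down as a small decision function. This is a sketch under assumptions, not pg_ctl code: the state names, the function decide_startup_wait, the pidfile_grace parameter, and the convention that a negative max_wait means "no cap" are all invented for illustration. It keeps waiting while the server reports that it is starting up, fails quickly if no pid file appears, and applies an overall cap only when the caller asks for one.

```c
/* Hypothetical view of what a probe of the server returned. */
typedef enum
{
    SRV_NO_PIDFILE,     /* postmaster pid file has not appeared */
    SRV_STARTING_UP,    /* ping answered "database is starting up" */
    SRV_ACCEPTING       /* server accepts connections */
} start_state;

typedef enum { KEEP_WAITING, REPORT_SUCCESS, REPORT_FAILURE } wait_decision;

/*
 * Decide what to do after a probe. elapsed is seconds since the start
 * request, pidfile_grace is how long a missing pid file is tolerated, and
 * max_wait is an overall cap in seconds (negative means no cap, as for
 * an interactive user who can interrupt the wait by hand).
 */
wait_decision
decide_startup_wait(start_state state, long elapsed,
                    long pidfile_grace, long max_wait)
{
    if (state == SRV_ACCEPTING)
        return REPORT_SUCCESS;
    if (max_wait >= 0 && elapsed >= max_wait)
        return REPORT_FAILURE;      /* caller-supplied cap reached */
    if (state == SRV_NO_PIDFILE)
        return (elapsed >= pidfile_grace) ? REPORT_FAILURE : KEEP_WAITING;
    return KEEP_WAITING;            /* still starting up: be patient */
}
```

Under these assumptions, a warm standby that stays in the starting-up state for an hour is still waited on when no cap is given, while a missing pid file after a 5 second grace period is reported as a failure immediately. A shutdown counterpart would need a separate rule for "smart mode" stops, where clients may hold sessions open indefinitely.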
Bruce Momjian wrote:
How about an environment variable to control the timeout? Is that
cleaner?
I don't see why it should be. I think Peter's --timeout suggestion
should be just fine.
cheers
andrew
Peter Eisentraut wrote:
On Friday, 17 August 2007, Peter Eisentraut wrote:
I'm having trouble with the hardcoded 60 second timeout in pg_ctl. pg_ctl
sometimes just times out and there is no way to make it wait a little
longer. I would like to add an option to be able to change that, say
pg_ctl -w --timeout=120. Comments?
Lost track of this, but it keeps biting me.
Somehow, the 60 second timeout seems completely arbitrary anyway. Maybe we
should remove it altogether. We could add an option as described above, but
then the packager who creates the init script or whoever creates the initial
configuration will have to make an equally arbitrary choice. And most likely
you will not notice that your configuration is insufficient until you are
really in a bind.
What should we do?
I think the mythical pg_ping utility should be written. It seems the
easiest way out of the problem.
Maybe hack the postmaster to have a new special connection mode which
keeps the connection open until the startup process exits, to avoid
polling continuously (ideally report progress too, if at all possible).
--
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
Y dijo Dios: "Que sea Satanás, para que la gente no me culpe de todo a mí."
"Y que hayan abogados, para que la gente no culpe de todo a Satanás"
Andrew Dunstan <andrew@dunslane.net> writes:
Bruce Momjian wrote:
How about an environment variable to control the timeout? Is that
cleaner?
I don't see why it should be. I think Peter's --timeout suggestion
should be just fine.
I wrote a moment ago that the user can hit control-C when he gets bored,
but that argument only works for interactive use of pg_ctl. In a script
I think you'd want a --timeout option. I don't see the advantage of
an environment variable in either scenario.
regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
I think the mythical pg_ping utility should be written. It seems the
easiest way out of the problem.
If pg_ctl were still a shell script there would be some point in that,
but since it's a C program it can certainly do anything a separate
utility would do.
regards, tom lane
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
I think the mythical pg_ping utility should be written. It seems the
easiest way out of the problem.
If pg_ctl were still a shell script there would be some point in that,
but since it's a C program it can certainly do anything a separate
utility would do.
Well, pg_ctl would not be the only user of such a utility. Things like
(say) control panels for shared hosting could benefit from it as well.
As would system health monitors.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
I think the mythical pg_ping utility should be written. It seems the
easiest way out of the problem.
If pg_ctl were still a shell script there would be some point in that,
but since it's a C program it can certainly do anything a separate
utility would do.
Well, pg_ctl would not be the only user of such a utility. Things like
(say) control panels for shared hosting could benefit from it as well.
As would system health monitors.
I still see no point in creating a separate binary for the
functionality. If you want to make it available to shell scripts,
invent a "pg_ctl ping" subcommand.
regards, tom lane
Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
Bruce Momjian wrote:
How about an environment variable to control the timeout? Is that
cleaner?
I don't see why it should be. I think Peter's --timeout suggestion
should be just fine.
I wrote a moment ago that the user can hit control-C when he gets bored,
but that argument only works for interactive use of pg_ctl. In a script
I think you'd want a --timeout option. I don't see the advantage of
an environment variable in either scenario.
I have implemented pg_ctl -t secs timeout option with the attached
patch. It still defaults to 60. I did not code the 'ping' option.
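A note on the argument parsing: the patch's 't' case converts the argument with atoi(), which returns 0 for non-numeric input such as "abc" and gives no indication of an error. A stricter variant could look like the sketch below. The helper name parse_wait_seconds is hypothetical and not part of the patch, and rejecting zero and negative values is an assumption about the desired behavior, not something the thread decided.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: parse the -t argument as a
 * positive number of seconds. Returns false and leaves *seconds untouched
 * on empty input, trailing garbage, overflow, or a non-positive value.
 */
bool
parse_wait_seconds(const char *arg, int *seconds)
{
    char   *end;
    long    val;

    if (arg == NULL || *arg == '\0')
        return false;

    errno = 0;
    val = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0')
        return false;           /* overflow, or not entirely a number */
    if (val <= 0 || val > INT_MAX)
        return false;           /* assumption: timeout must be positive */

    *seconds = (int) val;
    return true;
}
```

In the option switch, the 't' case could call this helper with optarg and print a usage error when it returns false, instead of silently accepting a wait time of 0.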
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Attachments:
/pgpatches/pg_ctl (text/x-diff)
Index: doc/src/sgml/ref/pg_ctl-ref.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_ctl-ref.sgml,v
retrieving revision 1.42
diff -c -c -r1.42 pg_ctl-ref.sgml
*** doc/src/sgml/ref/pg_ctl-ref.sgml 10 Nov 2007 04:52:17 -0000 1.42
--- doc/src/sgml/ref/pg_ctl-ref.sgml 10 Nov 2007 05:03:03 -0000
***************
*** 24,29 ****
--- 24,30 ----
<command>pg_ctl</command>
<arg choice="plain">start</arg>
<arg>-w</arg>
+ <arg>-t <replaceable>seconds</replaceable></arg>
<arg>-s</arg>
<arg>-D <replaceable>datadir</replaceable></arg>
<arg>-l <replaceable>filename</replaceable></arg>
***************
*** 47,52 ****
--- 48,54 ----
<command>pg_ctl</command>
<arg choice="plain">restart</arg>
<arg>-w</arg>
+ <arg>-t <replaceable>seconds</replaceable></arg>
<arg>-s</arg>
<arg>-D <replaceable>datadir</replaceable></arg>
<arg>-c</arg>
***************
*** 80,85 ****
--- 82,88 ----
<arg>-P <replaceable>password</replaceable></arg>
<arg>-D <replaceable>datadir</replaceable></arg>
<arg>-w</arg>
+ <arg>-t <replaceable>seconds</replaceable></arg>
<arg>-o <replaceable>options</replaceable></arg>
<sbr>
<command>pg_ctl</command>
***************
*** 261,271 ****
</varlistentry>
<varlistentry>
<term><option>-w</option></term>
<listitem>
<para>
! Wait for the start or shutdown to complete. Times out after
! 60 seconds. This is the default for shutdowns. A successful
shutdown is indicated by removal of the <acronym>PID</acronym>
file. For starting up, a successful <command>psql -l</command>
indicates success. <command>pg_ctl</command> will attempt to
--- 264,284 ----
</varlistentry>
<varlistentry>
+ <term><option>-t</option></term>
+ <listitem>
+ <para>
+ Seconds to wait for start or shutdown to complete when
+ the <option>-w</> option is used.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-w</option></term>
<listitem>
<para>
! Wait for the start or shutdown to complete. The default wait time
! is 60 seconds. This is the default option for shutdowns. A successful
shutdown is indicated by removal of the <acronym>PID</acronym>
file. For starting up, a successful <command>psql -l</command>
indicates success. <command>pg_ctl</command> will attempt to
Index: src/bin/pg_ctl/pg_ctl.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v
retrieving revision 1.85
diff -c -c -r1.85 pg_ctl.c
*** src/bin/pg_ctl/pg_ctl.c 31 Oct 2007 10:55:25 -0000 1.85
--- src/bin/pg_ctl/pg_ctl.c 10 Nov 2007 05:03:04 -0000
***************
*** 1465,1485 ****
printf(_("%s is a utility to start, stop, restart, reload configuration files,\n"
"report the status of a PostgreSQL server, or signal a PostgreSQL process.\n\n"), progname);
printf(_("Usage:\n"));
! printf(_(" %s start [-w] [-D DATADIR] [-s] [-l FILENAME] [-o \"OPTIONS\"]\n"), progname);
printf(_(" %s stop [-W] [-D DATADIR] [-s] [-m SHUTDOWN-MODE]\n"), progname);
! printf(_(" %s restart [-w] [-D DATADIR] [-s] [-m SHUTDOWN-MODE] [-o \"OPTIONS\"]\n"), progname);
printf(_(" %s reload [-D DATADIR] [-s]\n"), progname);
printf(_(" %s status [-D DATADIR]\n"), progname);
printf(_(" %s kill SIGNALNAME PID\n"), progname);
#if defined(WIN32) || defined(__CYGWIN__)
printf(_(" %s register [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] [-D DATADIR]\n"
! " [-w] [-o \"OPTIONS\"]\n"), progname);
printf(_(" %s unregister [-N SERVICENAME]\n"), progname);
#endif
printf(_("\nCommon options:\n"));
printf(_(" -D, --pgdata DATADIR location of the database storage area\n"));
printf(_(" -s, --silent only print errors, no informational messages\n"));
printf(_(" -w wait until operation completes\n"));
printf(_(" -W do not wait until operation completes\n"));
printf(_(" --help show this help, then exit\n"));
--- 1465,1487 ----
printf(_("%s is a utility to start, stop, restart, reload configuration files,\n"
"report the status of a PostgreSQL server, or signal a PostgreSQL process.\n\n"), progname);
printf(_("Usage:\n"));
! printf(_(" %s start [-w] [-t secs] [-D DATADIR] [-s] [-l FILENAME] [-o \"OPTIONS\"]\n"), progname);
printf(_(" %s stop [-W] [-D DATADIR] [-s] [-m SHUTDOWN-MODE]\n"), progname);
! printf(_(" %s restart [-w] [-t secs] [-D DATADIR] [-s] [-m SHUTDOWN-MODE]\n\
! [-o \"OPTIONS\"]\n"), progname);
printf(_(" %s reload [-D DATADIR] [-s]\n"), progname);
printf(_(" %s status [-D DATADIR]\n"), progname);
printf(_(" %s kill SIGNALNAME PID\n"), progname);
#if defined(WIN32) || defined(__CYGWIN__)
printf(_(" %s register [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] [-D DATADIR]\n"
! " [-w] [-t timeout] [-o \"OPTIONS\"]\n"), progname);
printf(_(" %s unregister [-N SERVICENAME]\n"), progname);
#endif
printf(_("\nCommon options:\n"));
printf(_(" -D, --pgdata DATADIR location of the database storage area\n"));
printf(_(" -s, --silent only print errors, no informational messages\n"));
+ printf(_(" -t secs seconds to wait when using -w option\n"));
printf(_(" -w wait until operation completes\n"));
printf(_(" -W do not wait until operation completes\n"));
printf(_(" --help show this help, then exit\n"));
***************
*** 1592,1597 ****
--- 1594,1600 ----
{"mode", required_argument, NULL, 'm'},
{"pgdata", required_argument, NULL, 'D'},
{"silent", no_argument, NULL, 's'},
+ {"timeout", required_argument, NULL, 't'},
{"core-files", no_argument, NULL, 'c'},
{NULL, 0, NULL, 0}
};
***************
*** 1704,1709 ****
--- 1707,1715 ----
case 's':
silent_mode = true;
break;
+ case 't':
+ wait_seconds = atoi(optarg);
+ break;
case 'U':
if (strchr(optarg, '\\'))
register_username = xstrdup(optarg);