putting a bgworker to rest

Started by Andres Freund over 12 years ago, 9 messages
#1 Andres Freund
andres@2ndquadrant.com

Hi all,

I noticed the need to simply stop a bgworker after its work is done but
still have it restart in unusual circumstances like a crash.
Obviously I can just have it enter a loop where it checks its latch and
such, but that seems a bit pointless.

Would it make sense to add an extra return value or such for that?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#1)
Re: putting a bgworker to rest

Andres Freund wrote:

Hi all,

I noticed the need to simply stop a bgworker after its work is done but
still have it restart in unusual circumstances like a crash.
Obviously I can just have it enter a loop where it checks its latch and
such, but that seems a bit pointless.

Would it make sense to add an extra return value or such for that?

KaiGai also requested some more flexibility in the stop timing and
shutdown sequence. I understand the current design that workers are
always on can be a bit annoying.

How would postmaster know when to restart a worker that stopped?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#3 Andres Freund
andres@2ndquadrant.com
In reply to: Alvaro Herrera (#2)
Re: putting a bgworker to rest

On 2013-04-23 11:59:43 -0300, Alvaro Herrera wrote:

KaiGai also requested some more flexibility in the stop timing and
shutdown sequence. I understand the current design that workers are
always on can be a bit annoying.

How would postmaster know when to restart a worker that stopped?

I had imagined we would assign some return codes special
meaning. Currently 0 basically means "restart immediately", 1 means
"crashed, wait for some time", everything else results in a postmaster
restart. It seems we can just assign returncode 2 as "done", probably
with some enum or such hiding the numbers.
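A minimal sketch of how such an enum could hide the numbers, based only on the exit-code meanings described above. All identifiers here (BgwExitCode, BgwAction, bgw_action_for_exit) are hypothetical names chosen for illustration and are not part of PostgreSQL's API:

```c
/* Hypothetical exit codes a bgworker could return (illustrative names). */
typedef enum BgwExitCode
{
    BGW_EXIT_RESTART_NOW = 0,   /* "restart immediately" */
    BGW_EXIT_CRASHED = 1,       /* "crashed, wait for some time" */
    BGW_EXIT_DONE = 2           /* proposed: work finished, leave stopped */
} BgwExitCode;

/* Hypothetical reactions of the postmaster to a worker exit. */
typedef enum BgwAction
{
    BGW_ACTION_RESTART_IMMEDIATELY,
    BGW_ACTION_RESTART_AFTER_DELAY,
    BGW_ACTION_LEAVE_STOPPED,
    BGW_ACTION_RESTART_POSTMASTER
} BgwAction;

/* Map a worker's exit code to the action described in the proposal. */
BgwAction
bgw_action_for_exit(int exit_code)
{
    switch (exit_code)
    {
        case BGW_EXIT_RESTART_NOW:
            return BGW_ACTION_RESTART_IMMEDIATELY;
        case BGW_EXIT_CRASHED:
            return BGW_ACTION_RESTART_AFTER_DELAY;
        case BGW_EXIT_DONE:
            return BGW_ACTION_LEAVE_STOPPED;
        default:
            /* everything else results in a postmaster restart */
            return BGW_ACTION_RESTART_POSTMASTER;
    }
}
```

With this mapping a worker that exits with 2 stays stopped until the postmaster sees a crash or is restarted, while any unknown code keeps today's behaviour.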

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#4 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#3)
Re: putting a bgworker to rest

Andres Freund wrote:

I had imagined we would assign some return codes special
meaning. Currently 0 basically means "restart immediately", 1 means
"crashed, wait for some time", everything else results in a postmaster
restart. It seems we can just assign returncode 2 as "done", probably
with some enum or such hiding the numbers.

So a "done" worker would never be restarted, until postmaster sees a
crash or is itself restarted? I guess that'd be useful for workers
running during recovery, which terminate when recovery completes. Is
that your use case?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#5 Andres Freund
andres@2ndquadrant.com
In reply to: Alvaro Herrera (#4)
Re: putting a bgworker to rest

On 2013-04-23 14:11:26 -0300, Alvaro Herrera wrote:

So a "done" worker would never be restarted, until postmaster sees a
crash or is itself restarted? I guess that'd be useful for workers
running during recovery, which terminate when recovery completes. Is
that your use case?

Well, it's not actual postgres recovery, but something similar in the
context of logical replication.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Andres Freund (#3)
Re: putting a bgworker to rest

Andres Freund <andres@2ndquadrant.com> writes:

How would postmaster know when to restart a worker that stopped?

I had imagined we would assign some return codes special
meaning. Currently 0 basically means "restart immediately", 1 means
"crashed, wait for some time", everything else results in a postmaster
restart. It seems we can just assign returncode 2 as "done", probably
with some enum or such hiding the numbers.

In Erlang, the library that takes care of such things is called OTP, and it
proposes a supervisor model that knows when to restart a worker. The
specs for the restart behaviour are:

Restart = permanent | transient | temporary

Restart defines when a terminated child process should be restarted.

- A permanent child process is always restarted.

- A temporary child process is never restarted (not even when the
supervisor's restart strategy is rest_for_one or one_for_all and a
sibling's death causes the temporary process to be terminated).

- A transient child process is restarted only if it terminates
abnormally, i.e. with another exit reason than normal, shutdown or
{shutdown,Term}.
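As a rough illustration of how those three restart types could translate to a bgworker setting, here is a small C sketch. The names RestartType and should_restart are hypothetical, and "exited_normally" stands in for Erlang's normal, shutdown and {shutdown,Term} exit reasons:

```c
#include <stdbool.h>

/* Hypothetical restart types, mirroring the OTP terms quoted above. */
typedef enum RestartType
{
    RESTART_PERMANENT,  /* always restarted */
    RESTART_TRANSIENT,  /* restarted only after an abnormal exit */
    RESTART_TEMPORARY   /* never restarted */
} RestartType;

/* Decide whether a terminated child should be restarted. */
bool
should_restart(RestartType type, bool exited_normally)
{
    switch (type)
    {
        case RESTART_PERMANENT:
            return true;
        case RESTART_TRANSIENT:
            return !exited_normally;
        case RESTART_TEMPORARY:
            return false;
    }
    return false;               /* unknown type: be conservative */
}
```

In these terms the behaviour asked for at the start of the thread (stop when done, but restart after a crash) corresponds to the transient type.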

Then about restart frequency, what they have is:

The supervisors have a built-in mechanism to limit the number of
restarts which can occur in a given time interval. This is
determined by the values of the two parameters MaxR and MaxT in the
start specification returned by the callback function [ ... ]

If more than MaxR number of restarts occur in the last MaxT seconds,
then the supervisor terminates all the child processes and then
itself.
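The MaxR/MaxT rule can be sketched as a sliding window over restart timestamps. This is only an illustration under stated assumptions: RestartLimiter and its functions are hypothetical names, time is passed in as plain seconds, and max_restarts is assumed to be smaller than RESTART_LOG_CAPACITY:

```c
#include <stdbool.h>
#include <stddef.h>

#define RESTART_LOG_CAPACITY 64

/* Hypothetical limiter: at most max_restarts (MaxR) per window (MaxT). */
typedef struct RestartLimiter
{
    int     max_restarts;                   /* MaxR */
    long    window_seconds;                 /* MaxT */
    long    stamps[RESTART_LOG_CAPACITY];   /* recent restart times */
    size_t  count;                          /* valid entries in stamps */
} RestartLimiter;

void
restart_limiter_init(RestartLimiter *rl, int max_restarts, long window_seconds)
{
    rl->max_restarts = max_restarts;
    rl->window_seconds = window_seconds;
    rl->count = 0;
}

/*
 * Record a restart happening at time "now" (in seconds).  Returns true if
 * the restart is still within the limit, false if more than MaxR restarts
 * have occurred in the last MaxT seconds (the supervisor should give up).
 */
bool
restart_limiter_record(RestartLimiter *rl, long now)
{
    size_t  kept = 0;

    /* Forget restarts that have fallen out of the time window. */
    for (size_t i = 0; i < rl->count; i++)
    {
        if (now - rl->stamps[i] < rl->window_seconds)
            rl->stamps[kept++] = rl->stamps[i];
    }
    rl->count = kept;

    /* Remember this restart; a full log already means the limit is exceeded. */
    if (rl->count < RESTART_LOG_CAPACITY)
        rl->stamps[rl->count++] = now;

    return rl->count <= (size_t) rl->max_restarts;
}
```

A supervisor loop would call restart_limiter_record() each time a child dies and stop restarting, or shut everything down, once it returns false.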

You can read the whole thing here:

http://www.erlang.org/doc/design_principles/sup_princ.html#id71215

I think we should get some inspiration from them here.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#5)
Re: putting a bgworker to rest

On Tue, Apr 23, 2013 at 1:22 PM, Andres Freund <andres@2ndquadrant.com> wrote:

So a "done" worker would never be restarted, until postmaster sees a
crash or is itself restarted? I guess that'd be useful for workers
running during recovery, which terminate when recovery completes. Is
that your use case?

Well, it's not actual postgres recovery, but something similar in the
context of logical replication.

It's probably too late to be twiddling this very much more, but
another thing I think would be useful is for backends to have the
ability to request that the postmaster start a worker of type xyz,
rather than having the server start it automatically at startup time.
That's what you'd need for parallel query, and there might be some
replication-related use cases for such things as well. The general
usage pattern would be:

- regular backend realizes that it needs help
- kicks postmaster to start a helper process
- helper process runs for a while, doing work
- helper process finishes work, maybe waits around for some period of
time to see if any new work arrives, and then exits
- eventually go back to step 1
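The usage pattern above can be modelled as a small state machine. This is a toy simulation only, not how the postmaster would implement it: Helper, HelperState, helper_init and helper_step are hypothetical names, and time is counted in abstract "ticks":

```c
#include <stdbool.h>

/* Hypothetical states of an on-demand helper process. */
typedef enum HelperState
{
    HELPER_STOPPED,     /* not running; a backend must ask for it */
    HELPER_WORKING,     /* running and doing work */
    HELPER_LINGERING    /* out of work, waiting to see if more arrives */
} HelperState;

typedef struct Helper
{
    HelperState state;
    int         idle_ticks;     /* ticks spent lingering without work */
    int         linger_limit;   /* ticks to wait before exiting */
} Helper;

void
helper_init(Helper *h, int linger_limit)
{
    h->state = HELPER_STOPPED;
    h->idle_ticks = 0;
    h->linger_limit = linger_limit;
}

/* Advance the helper by one tick; returns the state after the tick. */
HelperState
helper_step(Helper *h, bool work_available)
{
    switch (h->state)
    {
        case HELPER_STOPPED:
            /* backend needs help and kicks the postmaster to start us */
            if (work_available)
                h->state = HELPER_WORKING;
            break;
        case HELPER_WORKING:
            /* work finished: wait around for a while before exiting */
            if (!work_available)
            {
                h->state = HELPER_LINGERING;
                h->idle_ticks = 0;
            }
            break;
        case HELPER_LINGERING:
            if (work_available)
                h->state = HELPER_WORKING;
            else if (++h->idle_ticks >= h->linger_limit)
                h->state = HELPER_STOPPED;  /* exit; back to step 1 */
            break;
    }
    return h->state;
}
```

The linger period is what keeps a burst of requests from paying the process start-up cost each time.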

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8 Peter Eisentraut
peter_e@gmx.net
In reply to: Dimitri Fontaine (#6)
Re: putting a bgworker to rest

On 4/24/13 12:30 PM, Dimitri Fontaine wrote:

In Erlang, the lib that cares about such things is called OTP, and that
proposes a model of supervisor that knows when to restart a worker. The
specs for the restart behaviour are:

Restart = permanent | transient | temporary

There is also supervisord; see configuration settings "autorestart" and
"exitcodes" here:

http://supervisord.org/configuration.html#program-x-section-settings

Yes, the feature creep is in full progress!

#9 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#8)
Re: putting a bgworker to rest

Peter Eisentraut wrote:

There is also supervisord; see configuration settings "autorestart" and
"exitcodes" here:

http://supervisord.org/configuration.html#program-x-section-settings

Yes, the feature creep is in full progress!

The main missing feature before this can be sensibly implemented, in my
view, is some way to make workers start when they are stopped, assuming
no intervening postmaster crash. I suppose we could write a
SQL-callable function so that a backend can signal postmaster to launch
a worker. For this to work, I think we need an SQL-accessible way to
list existing registered workers, along with whether they are running or
not, and some identifier. However, the list of registered workers and
their statuses currently only exists in postmaster local memory;
exporting that might be problematic. (Maybe a simple file with a list
of registered workers, but not the status, is good enough. Postmaster
could write it after registration is done.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
