Autovacuum launcher doesn't notice death of postmaster immediately

Started by Peter Eisentraut over 18 years ago, 25 messages
#1Peter Eisentraut
peter_e@gmx.net

I notice that in 8.3, when I kill the postmaster process with SIGKILL or
SIGSEGV, the child processes writer and stats collector go away
immediately, but the autovacuum launcher hangs around for up to a
minute. (I suppose this has to do with the periodic wakeups?). When
you try to restart the postmaster before that it fails with a complaint
that someone is still attached to the shared memory segment.

These are obviously not normal modes of operation, but I fear that this
could cause some problems with people's control scripts of the
sort, "it crashed, let's try to restart it".

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#2Alvaro Herrera
alvherre@commandprompt.com
In reply to: Peter Eisentraut (#1)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Peter Eisentraut wrote:

I notice that in 8.3, when I kill the postmaster process with SIGKILL or
SIGSEGV, the child processes writer and stats collector go away
immediately, but the autovacuum launcher hangs around for up to a
minute. (I suppose this has to do with the periodic wakeups?). When
you try to restart the postmaster before that it fails with a complaint
that someone is still attached to the shared memory segment.

These are obviously not normal modes of operation, but I fear that this
could cause some problems with people's control scripts of the
sort, "it crashed, let's try to restart it".

The launcher is set up to wake up in autovacuum_naptime seconds at most.
So if the user configures a ridiculous time (for example 86400 seconds,
which I've seen) then the launcher would not detect the postmaster death
for a very long time, which is probably bad. (You measured a one minute
delay because that's the default naptime).

Maybe this is not such a hot idea, and we should wake the launcher up
every 10 seconds (or less?). I picked 10 seconds because that's the
time the bgwriter sleeps if there is no activity configured. Does this
sound acceptable? The only problem with waking it up too frequently is
that it would be waking the system up (for gettimeofday()) even if
nothing is happening.

I also just noticed that the launcher will check if postmaster is alive,
then sleep, and then possibly do some work. So if the postmaster died
in the sleep period, the launcher might try to do some work. Should we
add a check for postmaster liveness after the sleep?
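
A minimal sketch of the ordering problem described above, assuming a simulated launcher rather than the real autovacuum.c code. The names SimState, sim_nap and launcher_iteration are hypothetical and exist only for illustration. The point is that a second liveness check placed after the sleep keeps the launcher from starting work once the postmaster has died during the nap.

```c
#include <stdbool.h>

/* Simulated state; every name here is illustrative, not a PostgreSQL API. */
typedef struct SimState
{
    bool postmaster_alive;  /* is the parent process still running? */
    bool dies_during_nap;   /* should the parent die while we sleep? */
    int  work_done;         /* number of times the launcher did work */
} SimState;

/* Stand-in for the launcher's sleep: the postmaster may die meanwhile. */
void sim_nap(SimState *s)
{
    if (s->dies_during_nap)
        s->postmaster_alive = false;
}

/*
 * One launcher iteration: check, sleep, check again, then work.
 * Returns false when the launcher should exit instead of working.
 */
bool launcher_iteration(SimState *s)
{
    if (!s->postmaster_alive)
        return false;           /* check before the sleep */

    sim_nap(s);

    if (!s->postmaster_alive)
        return false;           /* the additional check after the sleep */

    s->work_done++;
    return true;
}
```

Without the second check, an iteration in which the postmaster dies during the nap would still reach the work step.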

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#3Jim C. Nasby
decibel@decibel.org
In reply to: Alvaro Herrera (#2)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

On Mon, Jun 04, 2007 at 11:04:26AM -0400, Alvaro Herrera wrote:

The launcher is set up to wake up in autovacuum_naptime seconds at most.
So if the user configures a ridiculous time (for example 86400 seconds,
which I've seen) then the launcher would not detect the postmaster death

Yeah, I've seen people set that up with the intention of "now autovacuum
will only run during our slow time!". I'm thinking it'd be worth
mentioning in the docs that this won't work, and instead suggesting that
they run vacuumdb -a or equivalent at that time instead. Thoughts?
--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#4Andrew Hammond
andrew.george.hammond@gmail.com
In reply to: Jim C. Nasby (#3)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

On 6/7/07, Jim C. Nasby <decibel@decibel.org> wrote:

On Mon, Jun 04, 2007 at 11:04:26AM -0400, Alvaro Herrera wrote:

The launcher is set up to wake up in autovacuum_naptime seconds at most.
So if the user configures a ridiculous time (for example 86400 seconds,
which I've seen) then the launcher would not detect the postmaster death

Is there some threshold after which we should have PostgreSQL emit a
warning to the effect of "autovacuum_naptime is very large. Are you
sure you know what you're doing?"

Yeah, I've seen people set that up with the intention of "now autovacuum
will only run during our slow time!". I'm thinking it'd be worth
mentioning in the docs that this won't work, and instead suggesting that
they run vacuumdb -a or equivalent at that time instead. Thoughts?

Hmmm... it seems to me that points new users towards not using
autovacuum, which doesn't seem like the best idea. I think it'd be
better to say that setting the naptime really high is a Bad Idea.
Instead, if they want to shift maintenance to "off hours" they should
consider using a cron job that bonks around the
pg_autovacuum.vac_base_thresh or vac_scale_factor values for tables
they don't want vacuumed during "operational hours" (set them really
high at the start of operational hours, then to normal during off
hours). Tweaking the enable column would work too, but they presumably
don't want to disable ANALYZE, although it's entirely likely that new
users don't know what ANALYZE does, in which case they _really_ don't
want to disable it.

This should probably be very close to a section that says something
about how insufficient maintenance can be expected to lead to greater
performance issues than using autovacuum with default settings.
Assuming we believe that to be the case, which I think is reasonable
given that we are now defaulting to having autovacuum enabled.

Andrew

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Hammond (#4)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

"Andrew Hammond" <andrew.george.hammond@gmail.com> writes:

Hmmm... it seems to me that points new users towards not using
autovacuum, which doesn't seem like the best idea. I think it'd be
better to say that setting the naptime really high is a Bad Idea.

It seems like we should have an upper limit on the GUC variable that's
less than INT_MAX ;-). Would an hour be sane? 10 minutes?

This is independent of the problem at hand, though, which is that we
probably want the launcher to notice postmaster death in less time
than autovacuum_naptime, for reasonable values of same.

regards, tom lane

#6Matthew T. O'Connor
matthew@zeut.net
In reply to: Tom Lane (#5)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Tom Lane wrote:

"Andrew Hammond" <andrew.george.hammond@gmail.com> writes:

Hmmm... it seems to me that points new users towards not using
autovacuum, which doesn't seem like the best idea. I think it'd be
better to say that setting the naptime really high is a Bad Idea.

It seems like we should have an upper limit on the GUC variable that's
less than INT_MAX ;-). Would an hour be sane? 10 minutes?

This is independent of the problem at hand, though, which is that we
probably want the launcher to notice postmaster death in less time
than autovacuum_naptime, for reasonable values of same.

Do we need a configurable autovacuum naptime at all? I know I put it in
the original contrib autovacuum because I had no idea what knobs might
be needed. I can't see a good reason to ever have a naptime longer than
the default 60 seconds, but I suppose one might want a smaller naptime
for a very active system?

#7Michael Paesold
mpaesold@gmx.at
In reply to: Matthew T. O'Connor (#6)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Matthew T. O'Connor schrieb:

Tom Lane wrote:

"Andrew Hammond" <andrew.george.hammond@gmail.com> writes:

Hmmm... it seems to me that points new users towards not using
autovacuum, which doesn't seem like the best idea. I think it'd be
better to say that setting the naptime really high is a Bad Idea.

It seems like we should have an upper limit on the GUC variable that's
less than INT_MAX ;-). Would an hour be sane? 10 minutes?

This is independent of the problem at hand, though, which is that we
probably want the launcher to notice postmaster death in less time
than autovacuum_naptime, for reasonable values of same.

Do we need a configurable autovacuum naptime at all? I know I put it in
the original contrib autovacuum because I had no idea what knobs might
be needed. I can't see a good reason to ever have a naptime longer than
the default 60 seconds, but I suppose one might want a smaller naptime
for a very active system?

A PostgreSQL database on my laptop for testing. It should use as little
resources as possible while being idle. That would be a scenario for
naptime greater than 60 seconds, wouldn't it?

Best Regards
Michael Paesold

#8Zeugswetter Andreas ADI SD
ZeugswetterA@spardat.at
In reply to: Andrew Hammond (#4)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

The launcher is set up to wake up in autovacuum_naptime seconds at most.

Imho the fix is usually to have a sleep loop.

Andreas

#9Alvaro Herrera
alvherre@commandprompt.com
In reply to: Zeugswetter Andreas ADI SD (#8)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Zeugswetter Andreas ADI SD escribió:

The launcher is set up to wake up in autovacuum_naptime seconds at
most.

Imho the fix is usually to have a sleep loop.

This is what we have. The sleep time depends on the schedule of next
vacuum for the closest database in time. If naptime is high, the sleep
time will be high (depending on number of databases needing attention).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#10Matthew O'Connor
matthew@zeut.net
In reply to: Michael Paesold (#7)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Michael Paesold wrote:

Matthew T. O'Connor schrieb:

Do we need a configurable autovacuum naptime at all? I know I put it
in the original contrib autovacuum because I had no idea what knobs
might be needed. I can't see a good reason to ever have a naptime
longer than the default 60 seconds, but I suppose one might want a
smaller naptime for a very active system?

A PostgreSQL database on my laptop for testing. It should use as little
resources as possible while being idle. That would be a scenario for
naptime greater than 60 seconds, wouldn't it?

Perhaps, but that isn't the use case PostgreSQL is being designed for.
If that is what you really need, then you should probably disable
autovacuum. Also, a very long naptime means that autovacuum will still
wake up at random times to do the work. At least with a short
naptime, it will do the work shortly after you update your tables.

#11Zeugswetter Andreas ADI SD
ZeugswetterA@spardat.at
In reply to: Alvaro Herrera (#9)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

The launcher is set up to wake up in autovacuum_naptime seconds at most.

Imho the fix is usually to have a sleep loop.

This is what we have. The sleep time depends on the schedule
of next vacuum for the closest database in time. If naptime
is high, the sleep time will be high (depending on number of
databases needing attention).

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".
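
The loop suggested here can be sketched as follows, assuming a fake clock so that the example runs without actually sleeping. FakeClock, fake_should_exit, fake_sleep and split_sleep are hypothetical names, not functions from the PostgreSQL tree. The idea is to cut one long sleep into slices of at most one quantum and to check for an exit condition before each slice.

```c
#include <stdbool.h>

/* Fake clock used in place of real time; all names are illustrative. */
typedef struct FakeClock
{
    long now_usec;      /* current simulated time */
    long exit_at_usec;  /* simulated moment the exit condition turns true */
} FakeClock;

/* Stand-in for "has the postmaster died or was shutdown requested?" */
bool fake_should_exit(const FakeClock *c)
{
    return c->now_usec >= c->exit_at_usec;
}

/* Stand-in for a real sleep call: just advances the simulated clock. */
void fake_sleep(FakeClock *c, long usec)
{
    c->now_usec += usec;
}

/*
 * Sleep for total_usec in slices of at most quantum_usec, checking the
 * exit condition before every slice.  quantum_usec must be positive.
 * Returns the number of microseconds actually slept.
 */
long split_sleep(FakeClock *c, long total_usec, long quantum_usec)
{
    long slept = 0;

    while (slept < total_usec)
    {
        long slice = total_usec - slept;

        if (fake_should_exit(c))
            break;
        if (slice > quantum_usec)
            slice = quantum_usec;

        fake_sleep(c, slice);
        slept += slice;
    }
    return slept;
}
```

With a 60 second nap, a 10 second quantum and an exit condition that becomes true at 25 seconds, this loop stops after 30 seconds, so the worst-case detection delay is one quantum rather than the whole naptime.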

Andreas

#12Alvaro Herrera
alvherre@commandprompt.com
In reply to: Zeugswetter Andreas ADI SD (#11)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Zeugswetter Andreas ADI SD escribió:

The launcher is set up to wake up in autovacuum_naptime seconds at most.

Imho the fix is usually to have a sleep loop.

This is what we have. The sleep time depends on the schedule
of next vacuum for the closest database in time. If naptime
is high, the sleep time will be high (depending on number of
databases needing attention).

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting an upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything). Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.

--
Alvaro Herrera Developer, http://www.PostgreSQL.org/
"Limítate a mirar... y algún día verás"

#13Jim C. Nasby
decibel@decibel.org
In reply to: Matthew O'Connor (#10)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

On Fri, Jun 08, 2007 at 09:49:56AM -0400, Matthew O'Connor wrote:

Michael Paesold wrote:

Matthew T. O'Connor schrieb:

Do we need a configurable autovacuum naptime at all? I know I put it
in the original contrib autovacuum because I had no idea what knobs
might be needed. I can't see a good reason to ever have a naptime
longer than the default 60 seconds, but I suppose one might want a
smaller naptime for a very active system?

A PostgreSQL database on my laptop for testing. It should use as little
resources as possible while being idle. That would be a scenario for
naptime greater than 60 seconds, wouldn't it?

Perhaps, but that isn't the use case PostgreSQL is being designed for.
If that is what you really need, then you should probably disable
autovacuum. Also, a very long naptime means that autovacuum will still
wake up at random times to do the work. At least with a short
naptime, it will do the work shortly after you update your tables.

Agreed. Maybe 10 minutes might make sense, but the overhead of checking
to see if anything needs vacuuming is pretty tiny.

There *is* reason to allow setting the naptime smaller, though (or at
least there was; perhaps Alvaro's recent changes negate this need):
clusters that have a large number of databases. I've worked with folks
who are in a hosted environment and give each customer their own
database; it's not hard to get a couple hundred databases that way.
Setting the naptime higher than a second in such an environment would
mean it could be hours before a database is checked for vacuuming.
--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#14Jim C. Nasby
decibel@decibel.org
In reply to: Andrew Hammond (#4)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

On Thu, Jun 07, 2007 at 12:13:09PM -0700, Andrew Hammond wrote:

On 6/7/07, Jim C. Nasby <decibel@decibel.org> wrote:

On Mon, Jun 04, 2007 at 11:04:26AM -0400, Alvaro Herrera wrote:

The launcher is set up to wake up in autovacuum_naptime seconds at most.
So if the user configures a ridiculous time (for example 86400 seconds,
which I've seen) then the launcher would not detect the postmaster death

Is there some threshold after which we should have PostgreSQL emit a
warning to the effect of "autovacuum_naptime is very large. Are you
sure you know what you're doing?"

Yeah, I've seen people set that up with the intention of "now autovacuum
will only run during our slow time!". I'm thinking it'd be worth
mentioning in the docs that this won't work, and instead suggesting that
they run vacuumdb -a or equivalent at that time instead. Thoughts?

Hmmm... it seems to me that points new users towards not using
autovacuum, which doesn't seem like the best idea. I think it'd be

I think we could easily word it so that it's clear that just letting
autovacuum do its thing is preferred.

better to say that setting the naptime really high is a Bad Idea.
Instead, if they want to shift maintenance to "off hours" they should
consider using a cron job that bonks around the
pg_autovacuum.vac_base_thresh or vac_scale_factor values for tables
they don't want vacuumed during "operational hours" (set them really
high at the start of operational hours, then to normal during off
hours). Tweaking the enable column would work too, but they presumably
don't want to disable ANALYZE, although it's entirely likely that new
users don't know what ANALYZE does, in which case they _really_ don't
want to disable it.

That sounds like a rather ugly solution, and one that would be hard to
implement; not something to be putting in the docs.
--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#15Alvaro Herrera
alvherre@commandprompt.com
In reply to: Jim C. Nasby (#13)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Jim C. Nasby escribió:

There *is* reason to allow setting the naptime smaller, though (or at
least there was; perhaps Alvaro's recent changes negate this need):
clusters that have a large number of databases. I've worked with folks
who are in a hosted environment and give each customer their own
database; it's not hard to get a couple hundred databases that way.
Setting the naptime higher than a second in such an environment would
mean it could be hours before a database is checked for vacuuming.

Yes, the code in HEAD is different -- each database will be considered
separately. So the huge database taking all day to vacuum will not stop
the tiny databases from being vacuumed in a timely manner.

And the very huge table in that database will not stop the other tables
in the database from being vacuumed either. There can be more than one
worker in a single database.

The limit is autovacuum_max_workers.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#16Matthew T. O'Connor
matthew@zeut.net
In reply to: Alvaro Herrera (#15)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Alvaro Herrera wrote:

Jim C. Nasby escribió:

There *is* reason to allow setting the naptime smaller, though (or at
least there was; perhaps Alvaro's recent changes negate this need):
clusters that have a large number of databases. I've worked with folks
who are in a hosted environment and give each customer their own
database; it's not hard to get a couple hundred databases that way.
Setting the naptime higher than a second in such an environment would
mean it could be hours before a database is checked for vacuuming.

Yes, the code in HEAD is different -- each database will be considered
separately. So the huge database taking all day to vacuum will not stop
the tiny databases from being vacuumed in a timely manner.

And the very huge table in that database will not stop the other tables
in the database from being vacuumed either. There can be more than one
worker in a single database.

Ok, but I think the question posed is that in say a virtual hosting
environment there might be say 1,000 databases in the cluster. Am I
still going to have to wait a long time for my database to get vacuumed?
I don't think this has changed much no?

(If default naptime is 1 minute, then autovacuum won't even look at a
given database but once every 1,000 minutes (16.67 hours) assuming that
there isn't enough work to keep all the workers busy.)

#17Alvaro Herrera
alvherre@commandprompt.com
In reply to: Matthew T. O'Connor (#16)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Matthew T. O'Connor escribió:

Ok, but I think the question posed is that in say a virtual hosting
environment there might be say 1,000 databases in the cluster. Am I
still going to have to wait a long time for my database to get vacuumed?
I don't think this has changed much no?

Depends on how much time it takes to vacuum the other 999 databases.
The default max workers is 3.

(If default naptime is 1 minute, then autovacuum won't even look at a
given database but once every 1,000 minutes (16.67 hours) assuming that
there isn't enough work to keep all the workers busy.)

The naptime is per database. Which means if you have 1000 databases and
a naptime of 60 seconds, the launcher is going to wake up every 100
milliseconds to check things up. (This results from 60000 / 1000 = 60
ms, but there is a minimum of 100 ms just to keep things sane).

If there are 3 workers and each of the 1000 databases in average takes
10 seconds to vacuum, there will be around 3000 seconds between autovac
runs of your database assuming my math is right.
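
The two estimates above can be reproduced with a short sketch. The function names are hypothetical, and the formulas are only the ones stated in this message (naptime divided by the number of databases with a 100 ms floor, and total vacuum time divided by the number of workers); this is not the actual launcher code.

```c
/*
 * Launcher wake-up interval in milliseconds: the naptime is spread over
 * all databases, with a floor of 100 ms.  ndatabases must be positive.
 */
long long launcher_wake_interval_ms(int naptime_secs, int ndatabases)
{
    long long interval = (long long) naptime_secs * 1000 / ndatabases;

    return interval < 100 ? 100 : interval;
}

/*
 * Rough time in seconds for the workers to get through every database
 * once, assuming each database takes the same time and all workers stay
 * busy.  nworkers must be positive.
 */
long long full_cycle_secs(int ndatabases, int secs_per_database, int nworkers)
{
    return (long long) ndatabases * secs_per_database / nworkers;
}
```

For 1000 databases and a 60 second naptime the raw interval is 60 ms, which the floor raises to 100 ms. With 3 workers and 10 seconds per database the cycle estimate comes to about 3333 seconds, in line with the "around 3000 seconds" figure.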

I hope those 1000 databases you put in your shared hosting are not very
big.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#18Joshua D. Drake
jd@commandprompt.com
In reply to: Alvaro Herrera (#17)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Alvaro Herrera wrote:

Matthew T. O'Connor escribió:

Ok, but I think the question posed is that in say a virtual hosting
environment there might be say 1,000 databases in the cluster.

That is uhmmm insane... 1000 databases?

Joshua D. Drake

Am I still going to have to wait a long time for my database to get vacuumed?
I don't think this has changed much no?

Depends on how much time it takes to vacuum the other 999 databases.
The default max workers is 3.

(If default naptime is 1 minute, then autovacuum won't even look at a
given database but once every 1,000 minutes (16.67 hours) assuming that
there isn't enough work to keep all the workers busy.)

The naptime is per database. Which means if you have 1000 databases and
a naptime of 60 seconds, the launcher is going to wake up every 100
milliseconds to check things up. (This results from 60000 / 1000 = 60
ms, but there is a minimum of 100 ms just to keep things sane).

If there are 3 workers and each of the 1000 databases in average takes
10 seconds to vacuum, there will be around 3000 seconds between autovac
runs of your database assuming my math is right.

I hope those 1000 databases you put in your shared hosting are not very
big.

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

#19Dann Corbit
DCorbit@connx.com
In reply to: Joshua D. Drake (#18)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
owner@postgresql.org] On Behalf Of Joshua D. Drake
Sent: Friday, June 08, 2007 10:49 PM
To: Alvaro Herrera
Cc: Matthew T. O'Connor; Jim C. Nasby; Michael Paesold; Tom Lane; Andrew
Hammond; Peter Eisentraut; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Autovacuum launcher doesn't notice death of
postmaster immediately

Alvaro Herrera wrote:

Matthew T. O'Connor escribió:

Ok, but I think the question posed is that in say a virtual hosting
environment there might be say 1,000 databases in the cluster.

That is uhmmm insane... 1000 databases?

Not in a test environment. We have several hundred databases here. Of course, only a few dozen (or at most ~100) are of any one type, but I can imagine that under certain circumstances 1000 databases would not be unreasonable.

[snip]

#20ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Alvaro Herrera (#12)
1 attachment(s)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Alvaro Herrera <alvherre@commandprompt.com> wrote:

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting an upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything). Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.

Here is a patch for split-sleep of autovacuum_naptime.

There are some other issues in CVS HEAD; We use the calculation
{autovacuum_naptime * 1000000} in launcher_determine_sleep().
The result will be corrupted if we set autovacuum_naptime to >2147.

In another place, we use {autovacuum_naptime * 1000}, so we should
set the upper bound to INT_MAX/1000 instead of INT_MAX.
Incidentally, we've already had the same protections for
log_min_duration_statement and log_autovacuum.

I hope this patch could fix those large-autovacuum_naptime problems.
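
A minimal sketch of the overflow described above, assuming a platform where int is 32 bits (so INT_MAX is 2147483647). The helper names are hypothetical. The first function reports whether secs * 1000000 would overflow when computed in int; the second widens the operand before multiplying, in the spirit of the (uint64) cast in the patch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Would secs * 1000000 overflow if computed in plain int?
 * Assumes secs is non-negative, as the GUC's lower bound of 1 guarantees.
 */
bool usecs_overflow_int(int secs)
{
    return secs > INT_MAX / 1000000;
}

/* Widen to 64 bits before multiplying so the product cannot overflow. */
uint64_t secs_to_usecs(int secs)
{
    return (uint64_t) secs * 1000000;
}
```

With a 32-bit int, INT_MAX / 1000000 is 2147, which matches the ">2147" threshold quoted above.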

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachments:

autovacuum_naptime_overflow.patch (application/octet-stream)
Index: src/backend/postmaster/autovacuum.c
===================================================================
--- src/backend/postmaster/autovacuum.c	(HEAD)
+++ src/backend/postmaster/autovacuum.c	(autovacuum_naptime_overflow)
@@ -502,7 +502,15 @@
 										  INVALID_OFFSET, false);
 
 		/* Sleep for a while according to schedule */
-		pg_usleep(micros);
+		while (micros >= 10000000L)
+		{
+			if (got_SIGHUP || avlauncher_shutdown_request)
+				break;
+			pg_usleep(10000000L);
+			micros -= 10000000L;
+		}
+		if (!(got_SIGHUP || avlauncher_shutdown_request))
+			pg_usleep(micros);
 
 		/* the normal shutdown case */
 		if (avlauncher_shutdown_request)
@@ -709,7 +717,7 @@
 		usecs = 100000;	/* 100 ms */
 	}
 
-	return secs * 1000000 + usecs;
+	return (uint64) secs * 1000000 + usecs;
 }
 
 /*
Index: src/backend/utils/misc/guc.c
===================================================================
--- src/backend/utils/misc/guc.c	(HEAD)
+++ src/backend/utils/misc/guc.c	(autovacuum_naptime_overflow)
@@ -1645,7 +1645,7 @@
 			GUC_UNIT_S
 		},
 		&autovacuum_naptime,
-		60, 1, INT_MAX, NULL, NULL
+		60, 1, INT_MAX / 1000, NULL, NULL
 	},
 	{
 		{"autovacuum_vacuum_threshold", PGC_SIGHUP, AUTOVACUUM,

#21Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Alvaro Herrera (#12)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Alvaro Herrera wrote:

Zeugswetter Andreas ADI SD escribió:

The launcher is set up to wake up in autovacuum_naptime seconds at most.

Imho the fix is usually to have a sleep loop.

This is what we have. The sleep time depends on the schedule
of next vacuum for the closest database in time. If naptime
is high, the sleep time will be high (depending on number of
databases needing attention).

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting an upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything). Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.

There is also one "wild" solution: connect the postmaster and the bgwriter
with a socket or pipe, and use select() instead of sleep. If the
connection fails unexpectedly, select() returns immediately and we can
handle the problem right away. The same socket could also be used in
special cases where we need to wake the process up sooner.
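
A minimal POSIX sketch of this idea, assuming the child holds the read end of a pipe whose write end is kept open only by the parent. The enum and the function nap_or_peer_death are hypothetical names, not PostgreSQL code. select() serves as the timed sleep; end-of-file on the pipe means every write end has been closed, which is what happens when the parent dies.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

typedef enum
{
    NAP_TIMEOUT,    /* the full timeout elapsed, nothing happened */
    NAP_WOKEN,      /* the peer wrote a byte to wake us up early */
    NAP_PEER_GONE,  /* end-of-file: all write ends are closed */
    NAP_ERROR       /* select() or read() failed (including EINTR) */
} NapResult;

/* Sleep for up to timeout_ms, returning early on pipe activity. */
NapResult nap_or_peer_death(int read_fd, long timeout_ms)
{
    fd_set         rfds;
    struct timeval tv;
    char           byte;
    int            rc;
    ssize_t        n;

    FD_ZERO(&rfds);
    FD_SET(read_fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(read_fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return NAP_ERROR;
    if (rc == 0)
        return NAP_TIMEOUT;

    /* The descriptor is readable: either data arrived or we hit EOF. */
    n = read(read_fd, &byte, 1);
    if (n == 0)
        return NAP_PEER_GONE;
    return n > 0 ? NAP_WOKEN : NAP_ERROR;
}
```

A real implementation would also have to retry on EINTR and would need a Windows equivalent, which is the portability concern raised later in the thread.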

Zdenek

#22Magnus Hagander
magnus@hagander.net
In reply to: Zdenek Kotala (#21)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

On Tue, Jun 12, 2007 at 12:23:50PM +0200, Zdenek Kotala wrote:

Alvaro Herrera wrote:

Zeugswetter Andreas ADI SD escribió:

The launcher is set up to wake up in autovacuum_naptime seconds at most.

Imho the fix is usually to have a sleep loop.

This is what we have. The sleep time depends on the schedule
of next vacuum for the closest database in time. If naptime
is high, the sleep time will be high (depending on number of
databases needing attention).

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting an upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything). Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.

There is also one "wild" solution: connect the postmaster and the bgwriter
with a socket or pipe, and use select() instead of sleep. If the
connection fails unexpectedly, select() returns immediately and we can
handle the problem right away. The same socket could also be used in
special cases where we need to wake the process up sooner.

Given the amount of problems we've had with pipes on win32, let's try to
avoid adding extra ones unless they're really necessary. If split-sleep
works, that seems a safer bet.

//Magnus

#23Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Magnus Hagander (#22)
Re: Autovacuum launcher doesn't notice death of postmaster immediately

Magnus Hagander wrote:

On Tue, Jun 12, 2007 at 12:23:50PM +0200, Zdenek Kotala wrote:

Alvaro Herrera wrote:

Zeugswetter Andreas ADI SD escribió:

The launcher is set up to wake up in autovacuum_naptime seconds at most.

Imho the fix is usually to have a sleep loop.

This is what we have. The sleep time depends on the schedule
of next vacuum for the closest database in time. If naptime
is high, the sleep time will be high (depending on number of
databases needing attention).

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting an upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything). Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.

There is also one "wild" solution: connect the postmaster and the bgwriter
with a socket or pipe, and use select() instead of sleep. If the
connection fails unexpectedly, select() returns immediately and we can
handle the problem right away. The same socket could also be used in
special cases where we need to wake the process up sooner.

Given the amount of problems we've had with pipes on win32, let's try to
avoid adding extra ones unless they're really necessary. If split-sleep
works, that seems a safer bet.

OK, that could be a problem. But I'm afraid split-sleep is not a good
solution either. It could create a lot of race conditions in start/stop
scripts and monitoring tools. It would be much better to improve pg_ctl to
perform cleanup ("pg_ctl cleanup") when the postmaster fails.

I think we must offer packagers and integrators a deterministic way to
handle this issue.

Zdenek

#24Alvaro Herrera
alvherre@commandprompt.com
In reply to: ITAGAKI Takahiro (#20)
1 attachment(s)
Re: [PATCHES] Autovacuum launcher doesn't notice death of postmaster immediately

ITAGAKI Takahiro wrote:

Alvaro Herrera <alvherre@commandprompt.com> wrote:

No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
exit" instead of "sleep longtime".

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting an upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything). Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.

Here is a patch for split-sleep of autovacuum_naptime.

There are some other issues in CVS HEAD; We use the calculation
{autovacuum_naptime * 1000000} in launcher_determine_sleep().
The result will be corrupted if we set autovacuum_naptime to >2147.

Ugh. How about this patch; this avoids the overflow issue altogether.
I am not sure that this works on Win32 but it seems we are already using
struct timeval elsewhere, so I don't see why it wouldn't work.

In another place, we use {autovacuum_naptime * 1000}, so we should
set the upper bound to INT_MAX/1000 instead of INT_MAX.
Incidentally, we've already had the same protections for
log_min_duration_statement and log_autovacuum.

Hmm, yes, the naptime should have an upper bound of INT_MAX/1000. It
doesn't seem worth the trouble of changing those places, when we know
that such a high value of naptime is uselessly high.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachments:

av-naptime-overflow.patch (text/x-diff; charset=us-ascii)
Index: src/backend/postmaster/autovacuum.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.49
diff -c -p -r1.49 autovacuum.c
*** src/backend/postmaster/autovacuum.c	8 Jun 2007 21:21:28 -0000	1.49
--- src/backend/postmaster/autovacuum.c	13 Jun 2007 17:27:39 -0000
***************
*** 18,23 ****
--- 18,24 ----
  
  #include <signal.h>
  #include <sys/types.h>
+ #include <sys/time.h>
  #include <time.h>
  #include <unistd.h>
  
*************** int			autovacuum_vac_cost_limit;
*** 73,78 ****
--- 74,83 ----
  
  int			Log_autovacuum = -1;
  
+ 
+ /* maximum sleep duration in the launcher */
+ #define AV_SLEEP_QUANTUM 10
+ 
  /* Flags to tell if we are in an autovacuum process */
  static bool am_autovacuum_launcher = false;
  static bool am_autovacuum_worker = false;
*************** NON_EXEC_STATIC void AutoVacWorkerMain(i
*** 197,203 ****
  NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]);
  
  static Oid do_start_worker(void);
! static uint64 launcher_determine_sleep(bool canlaunch, bool recursing);
  static void launch_worker(TimestampTz now);
  static List *get_database_list(void);
  static void rebuild_database_list(Oid newdb);
--- 202,209 ----
  NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]);
  
  static Oid do_start_worker(void);
! static void launcher_determine_sleep(bool canlaunch, bool recursing,
! 						 struct timeval *nap);
  static void launch_worker(TimestampTz now);
  static List *get_database_list(void);
  static void rebuild_database_list(Oid newdb);
*************** AutoVacLauncherMain(int argc, char *argv
*** 487,493 ****
  
  	for (;;)
  	{
! 		uint64		micros;
  		bool	can_launch;
  		TimestampTz current_time = 0;
  
--- 493,499 ----
  
  	for (;;)
  	{
! 		struct timeval nap;
  		bool	can_launch;
  		TimestampTz current_time = 0;
  
*************** AutoVacLauncherMain(int argc, char *argv
*** 498,508 ****
  		if (!PostmasterIsAlive(true))
  			exit(1);
  
! 		micros = launcher_determine_sleep(AutoVacuumShmem->av_freeWorkers !=
! 										  INVALID_OFFSET, false);
  
! 		/* Sleep for a while according to schedule */
! 		pg_usleep(micros);
  
  		/* the normal shutdown case */
  		if (avlauncher_shutdown_request)
--- 504,542 ----
  		if (!PostmasterIsAlive(true))
  			exit(1);
  
! 		launcher_determine_sleep(AutoVacuumShmem->av_freeWorkers !=
!   								 INVALID_OFFSET, false, &nap);
! 
! 		/*
! 		 * Sleep for a while according to schedule.  We only sleep in
! 		 * AV_SLEEP_QUANTUM sec intervals, in order to promptly notice
! 		 * postmaster death.
! 		 */
! 		while (nap.tv_sec > 0 || nap.tv_usec > 0)
! 		{
! 			uint32	sleeptime;
! 
! 			sleeptime = nap.tv_usec;
! 			nap.tv_usec = 0;
  
! 			if (nap.tv_sec > 0)
! 			{
! 				sleeptime += Min(nap.tv_sec, AV_SLEEP_QUANTUM) * 1000000;
! 				nap.tv_sec -= Min(nap.tv_sec, AV_SLEEP_QUANTUM);
! 			}
! 			
! 			pg_usleep(sleeptime);
! 
! 			/*
! 			 * Emergency bailout if postmaster has died.  This is to avoid the
! 			 * necessity for manual cleanup of all postmaster children.
! 			 */
! 			if (!PostmasterIsAlive(true))
! 				exit(1);
! 
! 			if (avlauncher_shutdown_request || got_SIGHUP || got_SIGUSR1)
! 				break;
! 		}
  
  		/* the normal shutdown case */
  		if (avlauncher_shutdown_request)
*************** AutoVacLauncherMain(int argc, char *argv
*** 647,662 ****
  }
  
  /*
!  * Determine the time to sleep, in microseconds, based on the database list.
   *
   * The "canlaunch" parameter indicates whether we can start a worker right now,
!  * for example due to the workers being all busy.
   */
! static uint64
! launcher_determine_sleep(bool canlaunch, bool recursing)
  {
- 	long	secs;
- 	int		usecs;
  	Dlelem *elem;
  
  	/*
--- 681,695 ----
  }
  
  /*
!  * Determine the time to sleep, based on the database list.
   *
   * The "canlaunch" parameter indicates whether we can start a worker right now,
!  * for example due to the workers being all busy.  If this is false, we will
!  * cause a long sleep, which will be interrupted when a worker exits.
   */
! static void
! launcher_determine_sleep(bool canlaunch, bool recursing, struct timeval *nap)
  {
  	Dlelem *elem;
  
  	/*
*************** launcher_determine_sleep(bool canlaunch,
*** 667,689 ****
  	 */
  	if (!canlaunch)
  	{
! 		secs = autovacuum_naptime;
! 		usecs = 0;
  	}
  	else if ((elem = DLGetTail(DatabaseList)) != NULL)
  	{
  		avl_dbase  *avdb = DLE_VAL(elem);
  		TimestampTz	current_time = GetCurrentTimestamp();
  		TimestampTz	next_wakeup;
  
  		next_wakeup = avdb->adl_next_worker;
  		TimestampDifference(current_time, next_wakeup, &secs, &usecs);
  	}
  	else
  	{
  		/* list is empty, sleep for whole autovacuum_naptime seconds  */
! 		secs = autovacuum_naptime;
! 		usecs = 0;
  	}
  
  	/*
--- 700,727 ----
  	 */
  	if (!canlaunch)
  	{
! 		nap->tv_sec = autovacuum_naptime;
! 		nap->tv_usec = 0;
  	}
  	else if ((elem = DLGetTail(DatabaseList)) != NULL)
  	{
  		avl_dbase  *avdb = DLE_VAL(elem);
  		TimestampTz	current_time = GetCurrentTimestamp();
  		TimestampTz	next_wakeup;
+ 		long	secs;
+ 		int		usecs;
  
  		next_wakeup = avdb->adl_next_worker;
  		TimestampDifference(current_time, next_wakeup, &secs, &usecs);
+ 
+ 		nap->tv_sec = secs;
+ 		nap->tv_usec = usecs;
  	}
  	else
  	{
  		/* list is empty, sleep for whole autovacuum_naptime seconds  */
! 		nap->tv_sec = autovacuum_naptime;
! 		nap->tv_usec = 0;
  	}
  
  	/*
*************** launcher_determine_sleep(bool canlaunch,
*** 696,715 ****
  	 * We only recurse once.  rebuild_database_list should always return times
  	 * in the future, but it seems best not to trust too much on that.
  	 */
! 	if (secs == 0L && usecs == 0 && !recursing)
  	{
  		rebuild_database_list(InvalidOid);
! 		return launcher_determine_sleep(canlaunch, true);
  	}
  
  	/* 100ms is the smallest time we'll allow the launcher to sleep */
! 	if (secs <= 0L && usecs <= 100000)
  	{
! 		secs = 0L;
! 		usecs = 100000;	/* 100 ms */
  	}
- 
- 	return secs * 1000000 + usecs;
  }
  
  /*
--- 734,752 ----
  	 * We only recurse once.  rebuild_database_list should always return times
  	 * in the future, but it seems best not to trust too much on that.
  	 */
! 	if (nap->tv_sec == 0L && nap->tv_usec == 0 && !recursing)
  	{
  		rebuild_database_list(InvalidOid);
! 		launcher_determine_sleep(canlaunch, true, nap);
! 		return;
  	}
  
  	/* 100ms is the smallest time we'll allow the launcher to sleep */
! 	if (nap->tv_sec <= 0L && nap->tv_usec <= 100000)
  	{
! 		nap->tv_sec = 0L;
! 		nap->tv_usec = 100000;	/* 100 ms */
  	}
  }
  
  /*
#25Alvaro Herrera
alvherre@commandprompt.com
In reply to: Alvaro Herrera (#24)
Re: [PATCHES] Autovacuum launcher doesn't notice death of postmaster immediately

Alvaro Herrera wrote:

Ah; yes, what I was proposing (or thought about proposing, not sure if I
posted it or not) was putting an upper limit of 10 seconds in the sleep
(bgwriter sleeps 10 seconds if configured to not do anything). Though
10 seconds may seem like an eternity for systems like the ones Peter was
talking about, where there is a script trying to restart the server as
soon as the postmaster dies.

Peter, is 10 seconds good enough for you?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.