can't stop autovacuum by HUP'ing the server
I'd like to stop autovac by changing the conf file then sending the
server a HUP
This "appears" to work, the logs show autovac terminated by
administrative command. Then a few minutes later I see a vacuum
process spawned.
Is it possible that there are timers that aren't being properly
terminated here ?
Dave
Dave Cramer <pg@fastcrypt.com> writes:
I'd like to stop autovac by changing the conf file then sending the
server a HUP
Uh ... why should that stop an autovac already in progress? I'd
only expect it to affect future launches.
regards, tom lane
Hi Dave,
Dave Cramer wrote:
I'd like to stop autovac by changing the conf file then sending the
server a HUPThis "appears" to work, the logs show autovac terminated by
administrative command. Then a few minutes later I see a vacuum process
spawned.Is it possible that there are timers that aren't being properly
terminated here ?
I wonder whether there are tables that need an emergency vacuum (i.e.
they have reached the freeze horizon). What version are you running
anyway?
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 24-Aug-08, at 10:12 PM, Tom Lane wrote:
Dave Cramer <pg@fastcrypt.com> writes:
I'd like to stop autovac by changing the conf file then sending the
server a HUPUh ... why should that stop an autovac already in progress? I'd
only expect it to affect future launches.
Well, I go the extra mile and kill any remaing autovac procs
Here are the logs
2008-08-25 04:00:01 EDT [32276] LOG: autovacuum launcher shutting down
2008-08-25 04:00:01 EDT [20526] LOG: autovacuum launcher started
2008-08-25 04:00:01 EDT [21404] FATAL: terminating autovacuum
process due to administrator command
2008-08-25 04:00:01 EDT [23418] FATAL: terminating autovacuum
process due to administrator command
2008-08-25 04:00:01 EDT [21703] FATAL: terminating autovacuum
process due to administrator command
So now that I've read this why would the launcher start again ?
Dave
Show quoted text
regards, tom lane
On 24-Aug-08, at 10:17 PM, Alvaro Herrera wrote:
Hi Dave,
Dave Cramer wrote:
I'd like to stop autovac by changing the conf file then sending the
server a HUPThis "appears" to work, the logs show autovac terminated by
administrative command. Then a few minutes later I see a vacuum
process
spawned.Is it possible that there are timers that aren't being properly
terminated here ?I wonder whether there are tables that need an emergency vacuum (i.e.
they have reached the freeze horizon). What version are you running
anyway?
version 8.3 and doubtful.
Since a number of people actually do this it would seem like a better
way to temporarily suspend autovac should be on the todo ?
Dave
Show quoted text
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Dave Cramer wrote:
Since a number of people actually do this it would seem like a better
way to temporarily suspend autovac should be on the todo ?
No -- what you are doing is documented to work. However, it only stops
the autovac launcher, not a currently-running worker. If this isn't
working, I'd like to know why.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Dave Cramer wrote:
Well, I go the extra mile and kill any remaing autovac procs
Here are the logs
2008-08-25 04:00:01 EDT [32276] LOG: autovacuum launcher shutting down
2008-08-25 04:00:01 EDT [20526] LOG: autovacuum launcher started
What did you SIGHUP, the launcher or postmaster? You need the latter.
The launcher should exit automatically at that time.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 25-Aug-08, at 10:43 AM, Alvaro Herrera wrote:
Dave Cramer wrote:
Well, I go the extra mile and kill any remaing autovac procs
Here are the logs
2008-08-25 04:00:01 EDT [32276] LOG: autovacuum launcher shutting
down
2008-08-25 04:00:01 EDT [20526] LOG: autovacuum launcher startedWhat did you SIGHUP, the launcher or postmaster? You need the latter.
The launcher should exit automatically at that time.
No, I am HUP'ing the postmaster then subsequently killing any
autovacuums still around, which may be the problem.
I may be killing the launcher prematurely. I'll try again tonight.
Dave
Show quoted text
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Aug 25, 2008 at 1:07 PM, Dave Cramer <pg@fastcrypt.com> wrote:
On 25-Aug-08, at 10:43 AM, Alvaro Herrera wrote:
Dave Cramer wrote:
Well, I go the extra mile and kill any remaing autovac procs
Here are the logs
2008-08-25 04:00:01 EDT [32276] LOG: autovacuum launcher shutting down
2008-08-25 04:00:01 EDT [20526] LOG: autovacuum launcher startedWhat did you SIGHUP, the launcher or postmaster? You need the latter.
The launcher should exit automatically at that time.No, I am HUP'ing the postmaster then subsequently killing any autovacuums
still around, which may be the problem.I may be killing the launcher prematurely. I'll try again tonight.
Ok, here are the logs from last night
2008-08-26 04:00:02 EDT [25407] LOG: received SIGHUP, reloading
configuration files
2008-08-26 04:00:02 EDT [22649] LOG: autovacuum launcher shutting down
2008-08-26 04:00:02 EDT [30438] LOG: autovacuum launcher started
you can see the SIGHUP, the launcher being shut down, and starting right
back up again ???
is this expected behaviour ?
Dave
Dave Cramer wrote:
Ok, here are the logs from last night
2008-08-26 04:00:02 EDT [25407] LOG: received SIGHUP, reloading
configuration files
2008-08-26 04:00:02 EDT [22649] LOG: autovacuum launcher shutting down
2008-08-26 04:00:02 EDT [30438] LOG: autovacuum launcher startedyou can see the SIGHUP, the launcher being shut down, and starting right
back up again ???is this expected behaviour ?
Certainly not, and that's not what I see here either. I assume process
25407 is (was) the postmaster, yes?
If you "show autovacuum", is it on?
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Tue, Aug 26, 2008 at 9:37 AM, Alvaro Herrera
<alvherre@commandprompt.com>wrote:
Dave Cramer wrote:
Ok, here are the logs from last night
2008-08-26 04:00:02 EDT [25407] LOG: received SIGHUP, reloading
configuration files
2008-08-26 04:00:02 EDT [22649] LOG: autovacuum launcher shutting down
2008-08-26 04:00:02 EDT [30438] LOG: autovacuum launcher startedyou can see the SIGHUP, the launcher being shut down, and starting right
back up again ???is this expected behaviour ?
Certainly not, and that's not what I see here either. I assume process
25407 is (was) the postmaster, yes?If you "show autovacuum", is it on?
Yes that was the postmaster, and I did check to see if autovacuum was on,
and it was not.
Dave
Show quoted text
--
Alvaro Herrera
http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Tue, Aug 26, 2008 at 9:59 AM, Dave Cramer <pg@fastcrypt.com> wrote:
On Tue, Aug 26, 2008 at 9:37 AM, Alvaro Herrera <
alvherre@commandprompt.com> wrote:Dave Cramer wrote:
Ok, here are the logs from last night
2008-08-26 04:00:02 EDT [25407] LOG: received SIGHUP, reloading
configuration files
2008-08-26 04:00:02 EDT [22649] LOG: autovacuum launcher shutting down
2008-08-26 04:00:02 EDT [30438] LOG: autovacuum launcher startedyou can see the SIGHUP, the launcher being shut down, and starting right
back up again ???is this expected behaviour ?
Certainly not, and that's not what I see here either. I assume process
25407 is (was) the postmaster, yes?If you "show autovacuum", is it on?
Yes that was the postmaster, and I did check to see if autovacuum was on,
and it was not.Dave
So where do we go from here ?
Dave Cramer wrote:
On Tue, Aug 26, 2008 at 9:59 AM, Dave Cramer <pg@fastcrypt.com> wrote:
On Tue, Aug 26, 2008 at 9:37 AM, Alvaro Herrera <
alvherre@commandprompt.com> wrote:
Certainly not, and that's not what I see here either. I assume process
25407 is (was) the postmaster, yes?If you "show autovacuum", is it on?
Yes that was the postmaster, and I did check to see if autovacuum was on,
and it was not.So where do we go from here ?
The only possible explanation for this behavior is that somebody is
signalling the postmaster due to Xid wraparound issues. This is keyed
on some GUC vars -- Perhaps you have autovacuum_freeze_max_age set to an
insane value?
varsup.c line 246
/*
* We'll start trying to force autovacuums when oldest_datfrozenxid gets
* to be more than autovacuum_freeze_max_age transactions old.
*
* Note: guc.c ensures that autovacuum_freeze_max_age is in a sane range,
* so that xidVacLimit will be well before xidWarnLimit.
*
* [...]
*/
xidVacLimit = oldest_datfrozenxid + autovacuum_freeze_max_age;
...
if (TransactionIdFollowsOrEquals(curXid, xidVacLimit) &&
IsUnderPostmaster)
SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER);
However, I think that in allowed configurations you should also receive
these warnings:
/* Give an immediate warning if past the wrap warn point */
if (TransactionIdFollowsOrEquals(curXid, xidWarnLimit))
ereport(WARNING,
(errmsg("database \"%s\" must be vacuumed within %u transactions",
NameStr(*oldest_datname),
xidWrapLimit - curXid),
errhint("To avoid a database shutdown, execute a full-database VACUUM in \"%s\".",
NameStr(*oldest_datname))));
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Tue, Aug 26, 2008 at 10:56 AM, Alvaro Herrera <alvherre@commandprompt.com
wrote:
Dave Cramer wrote:
On Tue, Aug 26, 2008 at 9:59 AM, Dave Cramer <pg@fastcrypt.com> wrote:
On Tue, Aug 26, 2008 at 9:37 AM, Alvaro Herrera <
alvherre@commandprompt.com> wrote:Certainly not, and that's not what I see here either. I assume
process
25407 is (was) the postmaster, yes?
If you "show autovacuum", is it on?
Yes that was the postmaster, and I did check to see if autovacuum was
on,
and it was not.
So where do we go from here ?
The only possible explanation for this behavior is that somebody is
signalling the postmaster due to Xid wraparound issues. This is keyed
on some GUC vars -- Perhaps you have autovacuum_freeze_max_age set to an
insane value?
Doesn't appear to be insane ?
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced
vacuum
Dave Cramer wrote:
On Tue, Aug 26, 2008 at 10:56 AM, Alvaro Herrera <alvherre@commandprompt.com
The only possible explanation for this behavior is that somebody is
signalling the postmaster due to Xid wraparound issues. This is keyed
on some GUC vars -- Perhaps you have autovacuum_freeze_max_age set to an
insane value?Doesn't appear to be insane ?
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced
vacuum
Not only sane, but also the default ;-)
What's the max age(pg_database.datfrozenxid)?
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Aug 26, 2008 at 11:41 AM, Alvaro Herrera <alvherre@commandprompt.com
wrote:
Dave Cramer wrote:
On Tue, Aug 26, 2008 at 10:56 AM, Alvaro Herrera <
alvherre@commandprompt.com
The only possible explanation for this behavior is that somebody is
signalling the postmaster due to Xid wraparound issues. This is keyed
on some GUC vars -- Perhaps you have autovacuum_freeze_max_age set toan
insane value?
Doesn't appear to be insane ?
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced
vacuumNot only sane, but also the default ;-)
What's the max age(pg_database.datfrozenxid)?
select datfrozenxid from pg_database ;
datfrozenxid
--------------
201850617
101850961
86039359
21522712
Show quoted text
--
Alvaro Herrera
http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Aug 26, 2008 at 11:51 AM, Dave Cramer <pg@fastcrypt.com> wrote:
On Tue, Aug 26, 2008 at 11:41 AM, Alvaro Herrera <
alvherre@commandprompt.com> wrote:Dave Cramer wrote:
On Tue, Aug 26, 2008 at 10:56 AM, Alvaro Herrera <
alvherre@commandprompt.com
The only possible explanation for this behavior is that somebody is
signalling the postmaster due to Xid wraparound issues. This is keyed
on some GUC vars -- Perhaps you have autovacuum_freeze_max_age set toan
insane value?
Doesn't appear to be insane ?
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced
vacuumNot only sane, but also the default ;-)
What's the max age(pg_database.datfrozenxid)?
select datfrozenxid from pg_database ;
datfrozenxid
--------------
201850617
101850961
86039359
21522712
this code in autovacuum.c looks like it might be interesting
if (AutoVacuumShmem->av_signal[AutoVacForkFailed])
{
/*
* If the postmaster failed to start a new
worker, we sleep
* for a little while and resend the
signal. The new worker's
* state is still in memory, so this is
sufficient. After
* that, we restart the main loop.
*
* XXX should we put a limit to the number
of times we retry?
* I don't think it makes much sense,
because a future start
* of a worker will continue to fail in the
same way.
*/
AutoVacuumShmem->av_signal[AutoVacForkFailed] = false;
pg_usleep(100000L); /* 100ms */
SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
continue;
Do these signals get cleaned up on a reload ?
Dave
Dave Cramer wrote:
On Tue, Aug 26, 2008 at 11:41 AM, Alvaro Herrera <alvherre@commandprompt.com
What's the max age(pg_database.datfrozenxid)?
select datfrozenxid from pg_database ;
datfrozenxid
--------------
201850617
101850961
86039359
21522712
Well, the first one is over the limit, isn't it?
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Dave Cramer wrote:
AutoVacuumShmem->av_signal[AutoVacForkFailed] = false;
pg_usleep(100000L); /* 100ms */SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
continue;Do these signals get cleaned up on a reload ?
Well, not on a reload specifically, but this signal is of prompt
response (i.e. the postmaster acts immediately on it). See
CheckPostmasterSignal. Also, note that this code starts a worker, not
the launcher which is what you're seeing.
The signal you're looking for is PMSIGNAL_START_AUTOVAC_LAUNCHER (see
the varsup.c code in the vicinity of what I posted earlier). The
postmaster response is to set start_autovac_launcher (see
sigusr1_handler in postmaster.c) and when this is seen set, the launcher
is started (see ServerLoop in postmaster.c). However the bit you're
interested in is *why* the signal is being sent, which is what the
freeze limits determine.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, Aug 26, 2008 at 12:10 PM, Alvaro Herrera <alvherre@commandprompt.com
wrote:
Dave Cramer wrote:
AutoVacuumShmem->av_signal[AutoVacForkFailed] = false;
pg_usleep(100000L); /* 100ms*/
SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
continue;Do these signals get cleaned up on a reload ?
Well, not on a reload specifically, but this signal is of prompt
response (i.e. the postmaster acts immediately on it). See
CheckPostmasterSignal. Also, note that this code starts a worker, not
the launcher which is what you're seeing.
Ok, back to why
turns out template0 is the culprit, why is autovac not vacuuming this ?
Dave