new GUC var: autovacuum_process_all_tables

Started by Alvaro Herreraabout 17 years ago36 messageshackers
Jump to latest
#1Alvaro Herrera
alvherre@2ndquadrant.com

Hi,

Right now, when autovacuum is turned on we always assume it's supposed
to process all tables except those that have autovacuum_enabled=false.

Now, sometimes it might make more sense to keep it enabled but have it
only check for certain tables, and leave the majority of them disabled.
For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not
wedded to the name), and have the user set autovacuum_enabled=true via
reloptions to enable it.

Opinions?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#2Joshua D. Drake
jd@commandprompt.com
In reply to: Alvaro Herrera (#1)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 17:45 -0300, Alvaro Herrera wrote:

Hi,

Right now, when autovacuum is turned on we always assume it's supposed
to process all tables except those that have autovacuum_enabled=false.

Now, sometimes it might make more sense to keep it enabled but have it
only check for certain tables, and leave the majority of them disabled.
For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not
wedded to the name), and have the user set autovacuum_enabled=true via
reloptions to enable it.

Opinions?

So you are inverting the option? What I mean is you are giving the
option of either:

A. Process everything unless false
B. Process nothing unless true

If I am understanding what you wrote correctly I am not sure I like the
idea as a whole. I think we should just always have it on and not have
it be optional. The rule of thumb should be, we autovacuum everything,
unless there is a extremely good reason not to and I think you should
have to explicitly turn off autovacuum for a relation.

Sincerely,

Joshua D. Drake

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997

#3Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#1)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 17:45 -0300, Alvaro Herrera wrote:

Right now, when autovacuum is turned on we always assume it's supposed
to process all tables except those that have autovacuum_enabled=false.

Now, sometimes it might make more sense to keep it enabled but have it
only check for certain tables, and leave the majority of them disabled.
For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not
wedded to the name), and have the user set autovacuum_enabled=true via
reloptions to enable it.

I would prefer it if that behaviour was enabled by putting a special
entry into pg_autovacuum, e.g.

"ALL TABLES", autovacuum_enabled=false

I don't really want more GUCs for every nuance of AV behaviour.

If you do this we'd want it to be selectable by database and schema as
well. Perhaps by inserting the oid of the relevant database or schema?

e.g. if we want to turn off AV for database X, which has oid x

then we insert into pg_autovacuum(x, false, ....)

or to make the default no-autovacuum for all tables in all databases

insert into pg_autovacuum(0, false, ....)

It would be useful if all but the first two columns were nullable also,
to avoid having to specify -1.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#3)
Re: new GUC var: autovacuum_process_all_tables

Simon Riggs wrote:

On Thu, 2009-02-05 at 17:45 -0300, Alvaro Herrera wrote:

For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not
wedded to the name), and have the user set autovacuum_enabled=true via
reloptions to enable it.

I would prefer it if that behaviour was enabled by putting a special
entry into pg_autovacuum,

So you're not aware that we're doing away with pg_autovacuum for good?
It's going to be replaced by reloptions, i.e.
ALTER TABLE foo SET (autovacuum_enabled = false);

Obviously there's no way to add a "catchall" setting.

e.g.
"ALL TABLES", autovacuum_enabled=false

I don't really want more GUCs for every nuance of AV behaviour.

In any case I fail to see how is this much different from a new GUC var.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: new GUC var: autovacuum_process_all_tables

Alvaro Herrera <alvherre@commandprompt.com> writes:

Right now, when autovacuum is turned on we always assume it's supposed
to process all tables except those that have autovacuum_enabled=false.

Now, sometimes it might make more sense to keep it enabled but have it
only check for certain tables, and leave the majority of them disabled.

When would that be? I can follow the use-case for vacuuming a selected
set of tables via cron-driven commands or whatever, and then excluding
those tables from autovacuum's purview. But there isn't a command to
"vacuum all tables except these". Without such a command available
to the cron-job, a switch such as you suggest is merely a foot-gun,
because it's dead certain some tables are going to get left out of
both manual and autovacuum processing.

And before anyone suggests it, I don't want to invent "vacuum all tables
except these". It'd make more sense to put the effort into developing
better scheduling control over autovacuum, such as a concept of
maintenance windows.

(BTW, autovac does vacuum tables to prevent wraparound even if you try
to tell it to skip them, right?)

regards, tom lane

#6Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#1)
Re: new GUC var: autovacuum_process_all_tables

On Thu, Feb 5, 2009 at 3:45 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Right now, when autovacuum is turned on we always assume it's supposed
to process all tables except those that have autovacuum_enabled=false.

Now, sometimes it might make more sense to keep it enabled but have it
only check for certain tables, and leave the majority of them disabled.
For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not
wedded to the name), and have the user set autovacuum_enabled=true via
reloptions to enable it.

Opinions?

Sounds horribly confusing. It's not very difficult to write a script
to set this value for every table in the database, if that's what you
want to do. Having autovacuum_enabled potentially mean two different
things depending on the value of some GUC sounds like a recipe for
confusion (and no I don't like it any better if we put the global
switch somewhere other than a GUC).

...Robert

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#5)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote:

It'd make more sense to put the effort into developing
better scheduling control over autovacuum, such as a concept of
maintenance windows.

We need that as well, not instead of.

People want to be able to specify (as an example)
* autovac these problem tables anytime required
* for all other tables disable AV, except on sundays [non-busy-times]

If we're going to build in scheduling featured for AV, I'd like to make
those scheduling features available to user defined tasks also. No need
to limit ourselves to just AV.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#8Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#4)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 18:25 -0300, Alvaro Herrera wrote:

So you're not aware that we're doing away with pg_autovacuum for good?
It's going to be replaced by reloptions, i.e.
ALTER TABLE foo SET (autovacuum_enabled = false);

Obviously there's no way to add a "catchall" setting.

Seems like a bad plan then. How do you reconcile those conflicting
requirements?

e.g.
"ALL TABLES", autovacuum_enabled=false

I don't really want more GUCs for every nuance of AV behaviour.

In any case I fail to see how is this much different from a new GUC var.

Rows in a table v. new parameters. We can allow endless table driven
complexity. Adding my_little_nuance=on|off strains most people's
patience.

How would I specify that database A wants AV turned off, but database B
wants it on?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: new GUC var: autovacuum_process_all_tables

Tom Lane wrote:

(BTW, autovac does vacuum tables to prevent wraparound even if you try
to tell it to skip them, right?)

Yes.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

In reply to: Alvaro Herrera (#1)
Re: new GUC var: autovacuum_process_all_tables

Alvaro Herrera escreveu:

Hi,

Right now, when autovacuum is turned on we always assume it's supposed
to process all tables except those that have autovacuum_enabled=false.

Now, sometimes it might make more sense to keep it enabled but have it
only check for certain tables, and leave the majority of them disabled.
For this we'd have a separate GUC parameter, as in $SUBJECT (I'm not
wedded to the name), and have the user set autovacuum_enabled=true via
reloptions to enable it.

Opinions?

What about 'autovacuum_mode'? Values could be 'all' and 'reloption'.

If we don't want to add another GUC, I'll go changing the 'autovacuum' GUC.
Values would be: 'on' means enable autovacuum and process all tables, 'off'
means disable autovacuum and 'reloption' (?) means only process those tables
that have reloption autovacuum_enabled=true. The con is that we couldn't
implement a per-{schema,database} switch for autovacuum. So I prefer the first
one.

--
Euler Taveira de Oliveira
http://www.timbira.com/

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#8)
Re: new GUC var: autovacuum_process_all_tables

Simon Riggs wrote:

On Thu, 2009-02-05 at 18:25 -0300, Alvaro Herrera wrote:

So you're not aware that we're doing away with pg_autovacuum for good?
It's going to be replaced by reloptions, i.e.
ALTER TABLE foo SET (autovacuum_enabled = false);

Obviously there's no way to add a "catchall" setting.

Seems like a bad plan then. How do you reconcile those conflicting
requirements?

I don't see them as conflicting; I see yours as a missing feature,
namely the ability to add tables to an autovacuum "group", which could
have settings attached. Being able to do that is the whole point of
moving settings to reloptions.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#7)
Re: new GUC var: autovacuum_process_all_tables

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote:

It'd make more sense to put the effort into developing
better scheduling control over autovacuum, such as a concept of
maintenance windows.

We need that as well, not instead of.

I disagree; adding every frammish anyone could ever think of is not
an overall improvement to the system.

My feeling is that we should be trying to eliminate use-cases for
cron-driven vacuuming, not trying to make sure that cron-driven
scripts can do anything autovacuum can.

The main remaining use-case seems to me to make vacuuming work adhere
to some business-determined schedule, hence maintenance windows seem
like the next thing to do.

regards, tom lane

#13Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#11)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 18:54 -0300, Alvaro Herrera wrote:

Simon Riggs wrote:

On Thu, 2009-02-05 at 18:25 -0300, Alvaro Herrera wrote:

So you're not aware that we're doing away with pg_autovacuum for good?
It's going to be replaced by reloptions, i.e.
ALTER TABLE foo SET (autovacuum_enabled = false);

Obviously there's no way to add a "catchall" setting.

Seems like a bad plan then. How do you reconcile those conflicting
requirements?

I don't see them as conflicting; I see yours as a missing feature,
namely the ability to add tables to an autovacuum "group", which could
have settings attached. Being able to do that is the whole point of
moving settings to reloptions.

So your changes will allow these?

ALTER DATABASE foo SET (autovacuum_enabled = false);
ALTER SCHEMA foo SET (autovacuum_enabled = false);

CREATE TABLE GROUP foo_group;
ALTER TABLE foo SET TABLE GROUP foo_group;
ALTER TABLE foo2 SET TABLE GROUP foo_group;
ALTER TABLE GROUP SET (autovacuum_enabled = false);

Hopefully the grouping of tables is not purely related to AV?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#14Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#12)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 17:08 -0500, Tom Lane wrote:

I disagree; adding every frammish anyone could ever think of is not
an overall improvement to the system.

:)

My feeling is that we should be trying to eliminate use-cases for
cron-driven vacuuming, not trying to make sure that cron-driven
scripts can do anything autovacuum can.

Agreed. IMO, the user should only have to think about "vacuum" in an
abstract sense. With the exception being those "few" tables that need
the customized configuration (thus reloptions).

The main remaining use-case seems to me to make vacuuming work adhere
to some business-determined schedule, hence maintenance windows seem
like the next thing to do.

Also agreed.

Sincerely,

Joshua D. Drake

regards, tom lane

--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997

#15Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#12)
Re: new GUC var: autovacuum_process_all_tables

Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote:

It'd make more sense to put the effort into developing
better scheduling control over autovacuum, such as a concept of
maintenance windows.

We need that as well, not instead of.

I disagree; adding every frammish anyone could ever think of is not
an overall improvement to the system.

Agreed, let's get this capability out in 8.4 and we can always adust it
based on user demand.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#15)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 17:57 -0500, Bruce Momjian wrote:

Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote:

It'd make more sense to put the effort into developing
better scheduling control over autovacuum, such as a concept of
maintenance windows.

We need that as well, not instead of.

I disagree; adding every frammish anyone could ever think of is not
an overall improvement to the system.

Agreed, let's get this capability out in 8.4 and we can always adust it
based on user demand.

Oh, I agree to limiting what we do for 8,4, but we need more later.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#17Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#12)
Re: new GUC var: autovacuum_process_all_tables

On Thu, 2009-02-05 at 17:08 -0500, Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2009-02-05 at 16:29 -0500, Tom Lane wrote:

It'd make more sense to put the effort into developing
better scheduling control over autovacuum, such as a concept of
maintenance windows.

We need that as well, not instead of.

I disagree; adding every frammish anyone could ever think of is not
an overall improvement to the system.

I like your word frammish and am watchful of such things myself.

My feeling is that we should be trying to eliminate use-cases for
cron-driven vacuuming,

Agreed.

not trying to make sure that cron-driven
scripts can do anything autovacuum can.

I'm not in favour of limiting our capability to internal actions only.
If we add a capability for scheduling work, we can easily make it
capable of scheduling many kinds of work.

Writing an application maintenance utility in PL/pgSQL is much better
than having to write it for all the different servers an application may
need to run on. We can't ignore that many people use Windows.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

#18Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#17)
Re: new GUC var: autovacuum_process_all_tables

On Thu, Feb 5, 2009 at 11:57 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Writing an application maintenance utility in PL/pgSQL is much better
than having to write it for all the different servers an application may
need to run on.

Welcome to the suction effect. If your scheduler is in the database
then you're stuck with the interfaces the database provides. When you
use those interfaces you're going to be stuck with whatever tools work
with them. Imagine trying to compose MIME email in plpgsql or do dns
lookups or interface with your C application code. Plpgsql is
singularly unsuited for anything other than database work. Yes we have
other languages but there are still relatively few and having them
running within a PL interface makes integrating with the rest of their
systems more awkward. And more dangerous -- consider what a simple
memory management bug can do if it's in a database backend instead of
a network client.

We can't ignore that many people use Windows.

I think that logic is backwards. People choose their development and
server environment based on what works best for them. It makes no
sense to engineer the system around the assumption that they don't
like developing using the best native tools. Our reimplementation of
the OS is always going to be second-rate by comparison and it's doing
nothing for them but imposing the disadvantages of the restrictions
being stuck in a database backend brings.

--
greg

#19Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#16)
Re: new GUC var: autovacuum_process_all_tables

Agreed, let's get this capability out in 8.4 and we can always adust it
based on user demand.

Oh, I agree to limiting what we do for 8,4, but we need more later.

Thinking about this a little more, the biggest problem I have with
this feature is that it makes autovacuum_enabled mean two different
things depending on context. But maybe we should change the name of
the reloption to "autovacuum" and have three values for it:
default|enabled|disabled.

Then we could add a GUC called autovacuum_by_default = on|off, and we
could later insert per-schema or per-database or per-table-group
defaults without introducing any backwards-incompatibility (default
would still mean default, though the default might come from a
different source).

...Robert

#20Andrew Dunstan
andrew@dunslane.net
In reply to: Simon Riggs (#17)
Re: new GUC var: autovacuum_process_all_tables

Simon Riggs wrote:

not trying to make sure that cron-driven
scripts can do anything autovacuum can.

I'm not in favour of limiting our capability to internal actions only.
If we add a capability for scheduling work, we can easily make it
capable of scheduling many kinds of work.

Writing an application maintenance utility in PL/pgSQL is much better
than having to write it for all the different servers an application may
need to run on. We can't ignore that many people use Windows.

I'm not sure what you're saying here. Windows has a scheduler (in my
setup, that's how my buildfarm members run). And there are third party
cron utilities as well.

cheers

andrew

#21Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#1)
#22Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#19)
#23Simon Riggs
simon@2ndQuadrant.com
In reply to: Andrew Dunstan (#20)
#24Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#18)
#25Andrew Dunstan
andrew@dunslane.net
In reply to: Simon Riggs (#23)
#26Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Josh Berkus (#21)
#27Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Joshua D. Drake (#14)
#28Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Ron Mayer (#27)
#29Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#26)
#30Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#13)
#31Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Josh Berkus (#29)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#30)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#31)
#34Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Robert Haas (#32)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Jaime Casanova (#34)
#36Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Robert Haas (#35)