Vacuuming Questions

Started by John Gardneralmost 18 years ago4 messagesgeneral
Jump to latest
#1John Gardner
john.gardner@tagish.co.uk

We have two PostgreSQL servers (8.2) running in a cluster.

We have autovacuum switched on on both servers and also we are running
the following as a cron job;

Server 1:
30 0,2,4,6,8,10,12,14,16,18,20,22 * * * /usr/bin/vacuumdb --all --analyze

Server 2:
30 1,3,5,7,9,11,13,15,17,19,21,23 * * * /usr/bin/vacuumdb --all --analyze

So, in other words we stagger the jobs one each server in the cluster.

Now, we're not seeing any problems with performance and we're not seeing
any bloat but I have a couple of questions which I hope some of you can
help me out on.

1) Is using the autovacuum daemon and running vacuumdb from a cron job
overkill?

2) The vendor of the clustering software suggests that I regularly run
vacuum --all --analyze on a specific table. Surely, if I'm running
vacuumdb --all --analyze isn't this just a wrapper for running the
vacuum --all --analyze command anyway? I guess that I'm asking why
would I want to run vacuum --all --analyze on a specific table, when I
regularly run vacuumdb --all --analyze on all databases?

Any help would be grateful.

Thanks

John

#2Joshua D. Drake
jd@commandprompt.com
In reply to: John Gardner (#1)
Re: Vacuuming Questions

On Wed, 23 Apr 2008 16:27:33 +0100
John Gardner <john.gardner@tagish.co.uk> wrote:

We have two PostgreSQL servers (8.2) running in a cluster.

Could you be a bit more specific about what you mean by: in a cluster?

Now, we're not seeing any problems with performance and we're not
seeing any bloat but I have a couple of questions which I hope some
of you can help me out on.

1) Is using the autovacuum daemon and running vacuumdb from a cron
job overkill?

Not necessarily. Autovacuum can get behind when managing larger tables
that are actively used. This is better in 8.3 because you can have more
than one autovacuum worker.

2) The vendor of the clustering software suggests that I regularly
run vacuum --all --analyze on a specific table. Surely, if I'm
running vacuumdb --all --analyze isn't this just a wrapper for
running the vacuum --all --analyze command anyway?

Well you have stated two different things here.

1. The vendor is stating you should vacuum a specific table
2. You are vacuuming the whole database.

Do you have a specific table that they are talking about? Is it
particularly busy? Say a session table?

I guess that I'm
asking why would I want to run vacuum --all --analyze on a specific
table, when I regularly run vacuumdb --all --analyze on all databases?

Yes those two are identical.

Joshua D. Drake

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

#3Scott Marlowe
scott.marlowe@gmail.com
In reply to: John Gardner (#1)
Re: Vacuuming Questions

On Wed, Apr 23, 2008 at 9:27 AM, John Gardner <john.gardner@tagish.co.uk> wrote:

We have two PostgreSQL servers (8.2) running in a cluster.

We have autovacuum switched on on both servers and also we are running the
following as a cron job;

Server 1:
30 0,2,4,6,8,10,12,14,16,18,20,22 * * * /usr/bin/vacuumdb --all --analyze

You can change that to

30 0-59/2 * * * ...
and
30 1-59/2 * * *

(I think the second one does the odd minutes. might need to test it.

1) Is using the autovacuum daemon and running vacuumdb from a cron job
overkill?

Maybe. It's probably better let autovacuum handle most of the db, and
use a single vacuum analyze on the one really busy table.

It's possible that you're using up too much of your I/O bandwidth
cleaning a bunch of table that don't need it. As Joshua mentioned,
it's probably a good idea to look at 8.3 due to its ability to run >1
autovacuum thread at a time.

#4John Gardner
john.gardner@tagish.co.uk
In reply to: Joshua D. Drake (#2)
Re: Vacuuming Questions

Joshua D. Drake wrote:

On Wed, 23 Apr 2008 16:27:33 +0100
John Gardner <john.gardner@tagish.co.uk> wrote:

We have two PostgreSQL servers (8.2) running in a cluster.

Could you be a bit more specific about what you mean by: in a cluster?

Well, we're using middleware technology to load balance and cluster the
PostgreSQL backends

Now, we're not seeing any problems with performance and we're not
seeing any bloat but I have a couple of questions which I hope some
of you can help me out on.

1) Is using the autovacuum daemon and running vacuumdb from a cron
job overkill?

Not necessarily. Autovacuum can get behind when managing larger tables
that are actively used. This is better in 8.3 because you can have more
than one autovacuum worker.

2) The vendor of the clustering software suggests that I regularly
run vacuum --all --analyze on a specific table. Surely, if I'm
running vacuumdb --all --analyze isn't this just a wrapper for
running the vacuum --all --analyze command anyway?

Well you have stated two different things here.

1. The vendor is stating you should vacuum a specific table
2. You are vacuuming the whole database.

Do you have a specific table that they are talking about? Is it
particularly busy? Say a session table?

OK, well initially, the vendor told us to stagger the vacuumdb command
every couple of hours, but latterly they are telling us to vacuum a
specific table. The table contains the transactions that are used for
rollback if any particular node in the cluster needs to be restarted.

I guess that I'm
asking why would I want to run vacuum --all --analyze on a specific
table, when I regularly run vacuumdb --all --analyze on all databases?

Thanks

John