Allow vacuumdb to only analyze

Started by Jim Nasbyalmost 17 years ago9 messageshackers
Jump to latest
#1Jim Nasby
Jim.Nasby@BlueTreble.com

One of the talks at PGCon (update in place?) recommended running
vacuumdb -z to analyze all tables to rebuild statistics. Problem with
that is it also vacuums everything. ISTM it'd be useful to be able to
just vacuum all databases in a cluster, so I hacked it into vacuumdb.

Of course, using a command called vacuumdb is rather silly, but I
don't see a reasonable way to deal with that. I did change the name
of the functions from vacuum_* to process_*, since they can vacuum
and/or analyze.

The only thing I see missing is the checks for invalid combinations
of options, which I'm thinking should go in the function rather than
in the option parsing section. But I didn't want to put any more
effort into this if it's not something we actually want.

Attachments:

patchapplication/octet-stream; name=patch; x-unix-mode=0640Download+41-34
#2Robert Haas
robertmhaas@gmail.com
In reply to: Jim Nasby (#1)
Re: Allow vacuumdb to only analyze

On Sat, May 23, 2009 at 10:31 PM, decibel <decibel@decibel.org> wrote:

One of the talks at PGCon (update in place?) recommended running vacuumdb -z
to analyze all tables to rebuild statistics. Problem with that is it also
vacuums everything. ISTM it'd be useful to be able to just vacuum all
databases in a cluster, so I hacked it into vacuumdb.

I think you meant "ISTM it'd be useful to be able to just analyze all
databases in a cluster".

Of course, using a command called vacuumdb is rather silly, but I don't see
a reasonable way to deal with that. I did change the name of the functions
from vacuum_* to process_*, since they can vacuum and/or analyze.

The only thing I see missing is the checks for invalid combinations of
options, which I'm thinking should go in the function rather than in the
option parsing section. But I didn't want to put any more effort into this
if it's not something we actually want.

It does seem somewhat useful to be able to analyze all databases
easily from the command-line, but putting it into vacuumdb is
certainly a hack. (By the way, we don't allow C++ style comments.)

I wonder if we ought not to find a way to make pg_migrator
automatically do some of these things after starting up the database.
Given autovacuum, it should be a pretty rare thing to need to manually
analyze every database in the cluster, so instead of building a
general tool to do this, it might make more sense to make it happen
automatically in the one case where we know it's necessary.

I noticed in Bruce's talk that there are a number of post-migration
steps which are currently partially manual. Ideally we'd like to
automate them all, preferably in some sort of well-thought-out order.
I actually suspect this is something like: analyze each database,
reindex those indices invalidated by the upgrade, analyze each
database again. Ideally we'd like to have some control over the
degree of parallelism here too but that might be asking too much for
8.4.

...Robert

#3Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#2)
Re: Allow vacuumdb to only analyze

Robert Haas <robertmhaas@gmail.com> wrote:

I noticed in Bruce's talk that there are a number of post-migration
steps which are currently partially manual. Ideally we'd like to
automate them all, preferably in some sort of well-thought-out

order.

I actually suspect this is something like: analyze each database,
reindex those indices invalidated by the upgrade, analyze each
database again.

We have found it useful to VACUUM FREEZE ANALYZE a converted database.
The first access to any page will cause writing of hint bits, and
we'd rather deal with that before we let the users in, to avoid having
sluggish performance for them while that happens. The FREEZE part is
to avoid a freeze of all subsequently untouched data in all tables at
some unpredictable time -- potentially in the middle of a busy
workday. The usual argument against aggressive freezing (that
forensic information useful in recovery of a corrupted database)
doesn't carry much weight right after a conversion.

-Kevin

#4Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Robert Haas (#2)
Re: Allow vacuumdb to only analyze

On May 23, 2009, at 9:51 PM, Robert Haas wrote:

vacuums everything. ISTM it'd be useful to be able to just vacuum all
databases in a cluster, so I hacked it into vacuumdb.

I think you meant "ISTM it'd be useful to be able to just analyze all
databases in a cluster".

Heh. "Oops".

Of course, using a command called vacuumdb is rather silly, but I
don't see
a reasonable way to deal with that. I did change the name of the
functions
from vacuum_* to process_*, since they can vacuum and/or analyze.

The only thing I see missing is the checks for invalid
combinations of
options, which I'm thinking should go in the function rather than
in the
option parsing section. But I didn't want to put any more effort
into this
if it's not something we actually want.

It does seem somewhat useful to be able to analyze all databases
easily from the command-line, but putting it into vacuumdb is
certainly a hack.

So... do we want a completely separate analyzedb command? That seems
like far overkill.

Arguably there are yet other things you'd want to do across an entire
cluster, so perhaps what we really want is a 'clusterrun' or
'clustercmd' command?

(By the way, we don't allow C++ style comments.)

Yeah, was being lazy since they're just temporary TODOs.

I wonder if we ought not to find a way to make pg_migrator
automatically do some of these things after starting up the database.

Sure, pg_migrator is what started this, but it's completely
orthogonal to the lack of a "analyze everything" command.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

#5Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Jim Nasby (#4)
Re: Allow vacuumdb to only analyze

On May 27, 2009, at 11:31 AM, decibel wrote:

It does seem somewhat useful to be able to analyze all databases
easily from the command-line, but putting it into vacuumdb is
certainly a hack.

So... do we want a completely separate analyzedb command? That
seems like far overkill.

Arguably there are yet other things you'd want to do across an
entire cluster, so perhaps what we really want is a 'clusterrun' or
'clustercmd' command?

No one else has commented, so I'm guessing that means no one is
opposed to allowing for vacuumdb to just analyze. If anyone else
objects to this please speak up before I put the final touches on the
patch...
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

#6Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#2)
Re: Allow vacuumdb to only analyze

Robert Haas wrote:

I noticed in Bruce's talk that there are a number of post-migration
steps which are currently partially manual. Ideally we'd like to
automate them all, preferably in some sort of well-thought-out order.
I actually suspect this is something like: analyze each database,
reindex those indices invalidated by the upgrade, analyze each
database again. Ideally we'd like to have some control over the
degree of parallelism here too but that might be asking too much for
8.4.

I can easily have pg_migrator run those scripts itself but I pushed it
on to the administrator so pg_migrator could finish and they could
decide when to run those scripts. For example, there might only be
issues in a few databases and the other database could be used fully
while the upgrade scripts are running. The same hold for analyzing the
cluster --- anything I thought might take a while I gave to the
administrators.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#7Bruce Momjian
bruce@momjian.us
In reply to: Jim Nasby (#1)
Re: Allow vacuumdb to only analyze

decibel wrote:

One of the talks at PGCon (update in place?) recommended running
vacuumdb -z to analyze all tables to rebuild statistics. Problem with
that is it also vacuums everything. ISTM it'd be useful to be able to
just vacuum all databases in a cluster, so I hacked it into vacuumdb.

Of course, using a command called vacuumdb is rather silly, but I
don't see a reasonable way to deal with that. I did change the name
of the functions from vacuum_* to process_*, since they can vacuum
and/or analyze.

The only thing I see missing is the checks for invalid combinations
of options, which I'm thinking should go in the function rather than
in the option parsing section. But I didn't want to put any more
effort into this if it's not something we actually want.

This is implemented in 9.0 from vacuumdb:

-Z, --analyze-only only update optimizer hints

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +
#8Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Bruce Momjian (#7)
Re: Allow vacuumdb to only analyze

On Thu, Feb 25, 2010 at 5:09 PM, Bruce Momjian <bruce@momjian.us> wrote:

This is implemented in 9.0 from vacuumdb:

         -Z, --analyze-only              only update optimizer hints

maybe just noise, but it's not better to say "optimizer statistics"
instead of "optimizer hints"?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

#9Bruce Momjian
bruce@momjian.us
In reply to: Jaime Casanova (#8)
Re: Allow vacuumdb to only analyze

Jaime Casanova wrote:

On Thu, Feb 25, 2010 at 5:09 PM, Bruce Momjian <bruce@momjian.us> wrote:

This is implemented in 9.0 from vacuumdb:

? ? ? ? ?-Z, --analyze-only ? ? ? ? ? ? ?only update optimizer hints

maybe just noise, but it's not better to say "optimizer statistics"
instead of "optimizer hints"?

Wow, I never noticed that but --analyze used "hints" too, and in 8.4 as
well. I have updated it to call it "statistics" in the attached patch.
The manual page does not call them hints.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +

Attachments:

/rtmp/difftext/x-diffDownload+4-4