unbalanced indexes -> fixed via dump/restore?

Started by will trillichabout 25 years ago7 messagesgeneral
Jump to latest
#1will trillich
will@serensoft.com

from reading various posts over the past month, i glean that
indexes can become lopsided and slow if already-sorted input data
is added to the already-indexed table.

if so, is it good practice to

A)
pg_dump -c mydb > db.out.sql
and then
psql mydb < db.out.sql
periodically?

or is it better to

B) merely 'reindex' on occasion?

what are the pro's and con's of each approach?

--

and, is there a mechanical method to determine IF reindexing is
a productive venture -- i.e. benchmarking routines or
lopsided-ness detector algorithms?

--
It is always hazardous to ask "Why?" in science, but it is often
interesting to do so just the same.
-- Isaac Asimov, 'The Genetic Code'

will@serensoft.com
http://newbieDoc.sourceforge.net/ -- we need your brain!
http://www.dontUthink.com/ -- your brain needs us!

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: will trillich (#1)
Re: unbalanced indexes -> fixed via dump/restore?

will trillich <will@serensoft.com> writes:

A)
pg_dump -c mydb > db.out.sql
and then
psql mydb < db.out.sql
periodically?

or is it better to

B) merely 'reindex' on occasion?

Plain old DROP INDEX / CREATE INDEX is probably the best-trodden path.
Your (A) seems like vastly more work than is needed. (B) might be
marginally easier than DROP/CREATE, but I'm not sure how much I trust
REINDEX; it's not been around all that long.

regards, tom lane

#3Alfred Perlstein
bright@wintelcom.net
In reply to: Tom Lane (#2)
Re: unbalanced indexes -> fixed via dump/restore?

* Tom Lane <tgl@sss.pgh.pa.us> [010307 14:30] wrote:

will trillich <will@serensoft.com> writes:

A)
pg_dump -c mydb > db.out.sql
and then
psql mydb < db.out.sql
periodically?

or is it better to

B) merely 'reindex' on occasion?

Plain old DROP INDEX / CREATE INDEX is probably the best-trodden path.
Your (A) seems like vastly more work than is needed. (B) might be
marginally easier than DROP/CREATE, but I'm not sure how much I trust
REINDEX; it's not been around all that long.

Is there a way to do this atomically, meaning so that no one can
get at the table after dropping, but before recreating the index?

lock the table during?

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alfred Perlstein (#3)
Re: unbalanced indexes -> fixed via dump/restore?

Alfred Perlstein <bright@wintelcom.net> writes:

* Tom Lane <tgl@sss.pgh.pa.us> [010307 14:30] wrote:

Plain old DROP INDEX / CREATE INDEX is probably the best-trodden path.
Your (A) seems like vastly more work than is needed. (B) might be
marginally easier than DROP/CREATE, but I'm not sure how much I trust
REINDEX; it's not been around all that long.

Is there a way to do this atomically, meaning so that no one can
get at the table after dropping, but before recreating the index?

In 7.1 it should work to do

begin;
drop index fooi;
create index fooi on foo (...);
end;

The DROP acquires an exclusive lock on foo, so there's no need for
an explicit "lock table foo", though you can add one if it seems
clearer that way.

Before 7.1 this is too risky, because if the create index fails for
some reason, you're hosed (the attempted rollback of DROP will screw up).

btw, REINDEX essentially does the same thing as the above, but there's
a lot of strange additional locking code in it, which I don't trust
much... call it a design disagreement with Hiroshi ;-)

regards, tom lane

#5Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: will trillich (#1)
Re: unbalanced indexes -> fixed via dump/restore?

Tom Lane wrote:

Alfred Perlstein <bright@wintelcom.net> writes:

* Tom Lane <tgl@sss.pgh.pa.us> [010307 14:30] wrote:

Plain old DROP INDEX / CREATE INDEX is probably the best-trodden path.
Your (A) seems like vastly more work than is needed. (B) might be
marginally easier than DROP/CREATE, but I'm not sure how much I trust
REINDEX; it's not been around all that long.

Is there a way to do this atomically, meaning so that no one can
get at the table after dropping, but before recreating the index?

In 7.1 it should work to do

begin;
drop index fooi;
create index fooi on foo (...);
end;

The DROP acquires an exclusive lock on foo, so there's no need for
an explicit "lock table foo", though you can add one if it seems
clearer that way.

Before 7.1 this is too risky, because if the create index fails for
some reason, you're hosed (the attempted rollback of DROP will screw up).

btw, REINDEX essentially does the same thing as the above,

Yes REINDEX is safe under postmaster in 7.1.
In addtion REINDEX has some advantages.
1) no necessity to scatter the index definition.
2) it doesn't change any reference among system objects.

but there's
a lot of strange additional locking code in it,which I don't trust
much... call it a design disagreement with Hiroshi ;-)

Is it LockClassForUpdate() ? If so it's never a special function.
It's only implementing a 'FOR UPDATE' part of 'SELECT .. FROM PG_CLASS'
and 'select .. for update' before 'update ..' is an oridinary
sequence of update operations.

Regards,
Hiroshi Inoue

#6Alfred Perlstein
bright@wintelcom.net
In reply to: Hiroshi Inoue (#5)
Re: unbalanced indexes -> fixed via dump/restore?

* Hiroshi Inoue <Inoue@tpf.co.jp> [010308 17:07] wrote:

Tom Lane wrote:

Alfred Perlstein <bright@wintelcom.net> writes:

* Tom Lane <tgl@sss.pgh.pa.us> [010307 14:30] wrote:

Plain old DROP INDEX / CREATE INDEX is probably the best-trodden path.
Your (A) seems like vastly more work than is needed. (B) might be
marginally easier than DROP/CREATE, but I'm not sure how much I trust
REINDEX; it's not been around all that long.

Is there a way to do this atomically, meaning so that no one can
get at the table after dropping, but before recreating the index?

In 7.1 it should work to do

begin;
drop index fooi;
create index fooi on foo (...);
end;

The DROP acquires an exclusive lock on foo, so there's no need for
an explicit "lock table foo", though you can add one if it seems
clearer that way.

Before 7.1 this is too risky, because if the create index fails for
some reason, you're hosed (the attempted rollback of DROP will screw up).

btw, REINDEX essentially does the same thing as the above,

Yes REINDEX is safe under postmaster in 7.1.
In addtion REINDEX has some advantages.
1) no necessity to scatter the index definition.
2) it doesn't change any reference among system objects.

but there's
a lot of strange additional locking code in it,which I don't trust
much... call it a design disagreement with Hiroshi ;-)

Is it LockClassForUpdate() ? If so it's never a special function.
It's only implementing a 'FOR UPDATE' part of 'SELECT .. FROM PG_CLASS'
and 'select .. for update' before 'update ..' is an oridinary
sequence of update operations.

Is there a way to do this under 7.0.3?

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

#7Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: will trillich (#1)
Re: unbalanced indexes -> fixed via dump/restore?

Alfred Perlstein wrote:

* Hiroshi Inoue <Inoue@tpf.co.jp> [010308 17:07] wrote:

Tom Lane wrote:

btw, REINDEX essentially does the same thing as the above,

Yes REINDEX is safe under postmaster in 7.1.
In addtion REINDEX has some advantages.
1) no necessity to scatter the index definition.
2) it doesn't change any reference among system objects.

but there's
a lot of strange additional locking code in it,which I don't trust
much... call it a design disagreement with Hiroshi ;-)

Is it LockClassForUpdate() ? If so it's never a special function.
It's only implementing a 'FOR UPDATE' part of 'SELECT .. FROM PG_CLASS'
and 'select .. for update' before 'update ..' is an oridinary
sequence of update operations.

Is there a way to do this under 7.0.3?

REINDEX for user tables is available in 7.0.3 but it isn't
safe because it must overwrite the existent index files.

Regards,
Hiroshi Inoue