Primary Key Index Bloat?

Started by David G. Johnstonover 12 years ago3 messagesgeneral
Jump to latest
#1David G. Johnston
david.g.johnston@gmail.com

Hi,

I have a table with the following usage characteristic:

INSERT bulk data using INSERT INTO ... SELECT ... FROM
-- this table uses a varchar(50) for the PK
-- the PK is rarely (effectively never unless a mistake was made) altered
-- always appending to the existing table; some bulk deletions in the past
-- no toasting of values

UPDATE these records many, many times (but never the PK)

At the moment the table itself (and thus all columns) is showing as size of
just under 5GB
The corresponding index for the PK (a single column, also present on the
table...) is between 8-10GB [1]I actually inadvertently have two indexes, one unique and one not, over the same field. The sizes of these two indexes are the rough endpoints of the 8-10GB range I provided. It appears that both indexes are being chosen for use by the planner. I have never REINDEXed this table that I
can recall.

I am guessing that it is the need for the index to point to new versions of
the physical record that the index is churning so much and causing this kind
of bloat?

[1]: I actually inadvertently have two indexes, one unique and one not, over the same field. The sizes of these two indexes are the rough endpoints of the 8-10GB range I provided. It appears that both indexes are being chosen for use by the planner.
the same field. The sizes of these two indexes are the rough endpoints of
the 8-10GB range I provided. It appears that both indexes are being chosen
for use by the planner.

I am preparing to REINDEX the unique index and DROP the non-unique one over
the same field - probably Tuesday evening. Does everything I am saying here
sound kosher or would someone like me to provide additional information?

The deletions were fairly voluminous when they occurred and so I would have
expected that enough empty pages would have been freed to avoid any
significant index bloat.

I have mostly left VACUUMing to auto-vacuum though the occasional manual one
has been performed. Never performed VACUUM FULL.

Been running 9.0 on Ubuntu 10.04 since day one with a pg_restore load. A
while ago and no comparative metrics available.

I greatly appreciate any sanity checks you can provide.

Thanks!

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Primary-Key-Index-Bloat-tp5778840.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Chris Ernst
cernst@zvelo.com
In reply to: David G. Johnston (#1)
Re: Primary Key Index Bloat?

On 11/17/2013 11:48 PM, David Johnston wrote:

I am guessing that it is the need for the index to point to new versions of
the physical record that the index is churning so much and causing this kind
of bloat?

Bingo.

I am preparing to REINDEX the unique index and DROP the non-unique one over
the same field - probably Tuesday evening. Does everything I am saying here
sound kosher or would someone like me to provide additional information?

...

I have mostly left VACUUMing to auto-vacuum though the occasional manual one
has been performed. Never performed VACUUM FULL.

REINDEX will only rebuild the indexes (obviously) and VACUUM FULL would
clean up any bloat in the indexes and the table itself. Either would
require an exclusive lock on the table.

If the exclusive lock is a problem, you might look in to pg_repack as an
option as well. It essentially recreates the table in parallel then
swaps in the newly built one. It only requires an exclusive lock for a
moment.

- Chris

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Sergey Konoplev
gray.ru@gmail.com
In reply to: David G. Johnston (#1)
Re: Primary Key Index Bloat?

On Sun, Nov 17, 2013 at 10:48 PM, David Johnston <polobo@yahoo.com> wrote:

I am preparing to REINDEX the unique index and DROP the non-unique one over
the same field - probably Tuesday evening. Does everything I am saying here
sound kosher or would someone like me to provide additional information?

[...]

I have mostly left VACUUMing to auto-vacuum though the occasional manual one
has been performed. Never performed VACUUM FULL.

You can try pgcompact from PgToolkit [1]https://github.com/grayhemp/pgtoolkit

pgcompact -d dbname -t tablename -r -v info

that will concurrently reindex those indexes that are bloated and will
remove bloat from the table without recreating it in the case if it is
bloated only. It might require pgstattuple extension [2]http://www.postgresql.org/docs/9.0/static/pgstattuple.html to be
installed [3]http://www.postgresql.org/docs/9.0/static/contrib.html .

[1]: https://github.com/grayhemp/pgtoolkit
[2]: http://www.postgresql.org/docs/9.0/static/pgstattuple.html
[3]: http://www.postgresql.org/docs/9.0/static/contrib.html

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general