problems with large table

Started by Mike Charnoky · over 18 years ago · 11 messages · general
#1 Mike Charnoky
noky@nextbus.com

Hi,

I'm using PostgreSQL 8.1.8 and am having trouble with a table which
contains a large amount of data. Data is constantly being inserted into
the table, roughly a million inserts per hour at peak. The table
currently has about 100 million entries which take up 14G of space (24G
with indices).

The problem in a nutshell: I noticed that certain queries were
excruciatingly slow, despite the use of an index. A vacuum analyze of
the table would not complete (despite running for 2 days). A reindex
also failed to complete after one day.

The details: I was trying to perform a count(*) based on a timestamp
field in the table (which is indexed). An EXPLAIN ANALYZE showed a high
cost even though an index scan was used. I tried to VACUUM ANALYZE the
table, thinking this might help. Yes, autovacuum is turned on, but
since pg8.1 does not store info about when a table was last vacuumed, I
decided to run this manually. After several hours, the vacuum did not
complete. So, I disabled the process which was writing to this table
and tried "set vacuum_cost_delay=0" before vacuuming. After two days,
the vacuum did not complete, so I stopped it and tried to reindex the
table, thinking that indices were corrupted. This also failed to
complete after one day.
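(As an aside: PostgreSQL 8.2 and later do track this in pg_stat_user_tables; a quick check there, under a hypothetical table name big_table, looks like this:)

```sql
-- Available on PostgreSQL 8.2+; these columns do not exist on 8.1.
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'big_table';  -- hypothetical table name
```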

At this point, I'm at a loss. I've searched the archives for similar
problems, but none of the suggestions have worked. Is the data in this
table corrupted? Why are both vacuum and reindex failing to complete?
Is there some sort of fine-tuning I should pay attention to? Any help
is much appreciated.

Mike

#2 Joshua D. Drake
jd@commandprompt.com
In reply to: Mike Charnoky (#1)
Re: problems with large table


Mike Charnoky wrote:

Hi,

At this point, I'm at a loss. I've searched the archives for similar
problems, but none of the suggestions have worked. Is the data in this
table corrupted? Why are both vacuum and reindex failing to complete?
Is there some sort of fine-tuning I should pay attention to? Any help
is much appreciated.

At this point, you are in a world of hurt :). If you stop a vacuum, you
have created a huge mess of dead rows in that table. My suggestion is
this: create a new table populated from the old one, rename the old
table out of the way, rename the new table to the old name, then run ANALYZE.
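Roughly, under hypothetical names (big_table with a timestamp column evt_time), and with all writers to the table stopped first:

```sql
BEGIN;
CREATE TABLE big_table_new AS SELECT * FROM big_table;  -- copies live rows only
ALTER TABLE big_table RENAME TO big_table_old;
ALTER TABLE big_table_new RENAME TO big_table;
COMMIT;

-- Indexes, constraints, and grants are NOT copied; recreate them, e.g.:
CREATE INDEX big_table_evt_time_idx ON big_table (evt_time);
ANALYZE big_table;
```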

Try again. :)

Sincerely,

Joshua D. Drake

Mike

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997 http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


#3 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Joshua D. Drake (#2)
Re: problems with large table

Joshua D. Drake wrote:


At this point, I'm at a loss. I've searched the archives for similar
problems, but none of the suggestions have worked. Is the data in this
table corrupted? Why are both vacuum and reindex failing to complete?
Is there some sort of fine-tuning I should pay attention to? Any help
is much appreciated.

At this point, you are in a world of hurt :). If you stop a vacuum, you
have created a huge mess of dead rows in that table. My suggestion is
this: create a new table populated from the old one, rename the old
table out of the way, rename the new table to the old name, then run ANALYZE.

Running CLUSTER is a faster and less error-prone way of doing the same thing.
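For example (CLUSTER rewrites the table in index order and leaves the dead tuples behind, but note it holds an ACCESS EXCLUSIVE lock while it runs; the index and table names here are hypothetical):

```sql
-- Pre-8.3 syntax; 8.3+ spells it: CLUSTER big_table USING big_table_evt_time_idx
CLUSTER big_table_evt_time_idx ON big_table;
ANALYZE big_table;  -- CLUSTER does not update planner statistics
```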

--
Alvaro Herrera Valdivia, Chile ICBM: S 39° 49' 18.1", W 73° 13' 56.4"
"No single strategy is always right (Unless the boss says so)"
(Larry Wall)

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#2)
Re: problems with large table

"Joshua D. Drake" <jd@commandprompt.com> writes:

At this point, you are in a world of hurt :). If you stop a vacuum you
have created a huge mess of dead rows in that table.

Only if it was a vacuum full, which he didn't mention having tried.

I'm kinda wondering whether the vacuum and reindex did anything at all,
or were blocked by some other process holding a lock on the table.
If they weren't blocked, then the problem is insufficient patience,
possibly combined with insufficient maintenance_work_mem.

regards, tom lane
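A quick way to test both of Tom's hypotheses (a sketch; the table name is hypothetical and catalog details vary slightly across releases):

```sql
-- 1. Is the maintenance command blocked? Ungranted locks mean it is waiting.
SELECT pid, relation::regclass, mode, granted
FROM pg_locks
WHERE NOT granted;

-- 2. Is maintenance_work_mem too small? Raise it for this session before vacuuming.
SET maintenance_work_mem = 262144;  -- in KB on 8.1; 8.2+ also accepts '256MB'
VACUUM VERBOSE big_table;
```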

#5 Mike Charnoky
noky@nextbus.com
In reply to: Tom Lane (#4)
Re: problems with large table

I have never heard that stopping a vacuum is problematic... I have had
to do this many times in the past without any adverse effects. Is there
some sort of documentation which elaborates on this issue?

For the record, I did a VACUUM ANALYZE, not FULL. Now that I think
about it, I probably should have used VERBOSE to see what was happening.
Nothing else was accessing the database, so no process had a lock on
the table.

Tom, regarding insufficient patience: are you suggesting that it is
normal for a vacuum of a table this size to take more than two days
under these circumstances? maintenance_work_mem is 16384.

Joshua: I'm copying the data to a new table right now, I'll see how that
goes.

Alvaro: The cluster suggestion probably won't help in my case since data
in the table should already be naturally ordered by date.

Mike

Tom Lane wrote:


"Joshua D. Drake" <jd@commandprompt.com> writes:

At this point, you are in a world of hurt :). If you stop a vacuum you
have created a huge mess of dead rows in that table.

Only if it was a vacuum full, which he didn't mention having tried.

I'm kinda wondering whether the vacuum and reindex did anything at all,
or were blocked by some other process holding a lock on the table.
If they weren't blocked, then the problem is insufficient patience,
possibly combined with insufficient maintenance_work_mem.

regards, tom lane


#6 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Mike Charnoky (#5)
Re: problems with large table

Mike Charnoky wrote:

Alvaro: The cluster suggestion probably won't help in my case since data
in the table should already be naturally ordered by date.

It's helpful not only for reordering, but also for getting rid of dead
tuples.

--
Alvaro Herrera http://www.flickr.com/photos/alvherre/
"Because Kim did nothing, but, mind you, with extraordinary success" ("Kim", Kipling)

#7 Phoenix Kiula
phoenix.kiula@gmail.com
In reply to: Alvaro Herrera (#6)
Re: problems with large table

On 13/09/2007, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Mike Charnoky wrote:

Alvaro: The cluster suggestion probably won't help in my case since data
in the table should already be naturally ordered by date.

It's helpful not only for reordering, but also for getting rid of dead
tuples.

Apart from creating a new table, indexing it, and then renaming it to
the original table, is there an alternative to CLUSTER that doesn't
impose a painful ACCESS EXCLUSIVE lock on the table? We are on
Postgres 8.2.3 and have a heavy-duty table that starts showing its
limits after a week or so. Autovacuum is on and working. FSM etc. is
fine, maintenance_work_mem is 256MB. But CLUSTER still takes upwards
of 30 minutes, which is unacceptable downtime for our web service.
Thanks for any tips!

#8 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Phoenix Kiula (#7)
Re: problems with large table

Phoenix Kiula wrote:

On 13/09/2007, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Mike Charnoky wrote:

Alvaro: The cluster suggestion probably won't help in my case since data
in the table should already be naturally ordered by date.

It's helpful not only for reordering, but also for getting rid of dead
tuples.

Apart from creating a new table, indexing it, then renaming it to
original table -- is there an alternative to CLUSTER that doesn't
impose a painful ACCESS EXCLUSIVE lock on the table? We are on
Postgres 8.2.3 and have a heavy duty table that starts showing its
limits after a week or so. Autovacuum is on and working. FSM etc is
fine, maintenance_work_mem is 256MB. But cluster still takes upwards
of 30 minutes, which is unacceptable downtime for our web service.
Thanks for any tips!

How large is this table, and how frequently is it updated?

--
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Machinism was outlawed under penalty of tickling to death"
(Ijon Tichy in Viajes, Stanislaw Lem)

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Phoenix Kiula (#7)
Re: problems with large table

"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:

Apart from creating a new table, indexing it, then renaming it to
original table -- is there an alternative to CLUSTER that doesn't
impose a painful ACCESS EXCLUSIVE lock on the table? We are on
Postgres 8.2.3 and have a heavy duty table that starts showing its
limits after a week or so. Autovacuum is on and working. FSM etc is
fine, maintenance_work_mem is 256MB. But cluster still takes upwards
of 30 minutes, which is unacceptable downtime for our web service.
Thanks for any tips!

If you're seeing steady bloat then FSM isn't as fine as you think.

regards, tom lane

#10 Phoenix Kiula
phoenix.kiula@gmail.com
In reply to: Tom Lane (#9)
Re: problems with large table

On 13/09/2007, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:

Apart from creating a new table, indexing it, then renaming it to
original table -- is there an alternative to CLUSTER that doesn't
impose a painful ACCESS EXCLUSIVE lock on the table? We are on
Postgres 8.2.3 and have a heavy duty table that starts showing its
limits after a week or so. Autovacuum is on and working. FSM etc is
fine, maintenance_work_mem is 256MB. But cluster still takes upwards
of 30 minutes, which is unacceptable downtime for our web service.
Thanks for any tips!

If you're seeing steady bloat then FSM isn't as fine as you think.

I am not sure if there's steady bloat. Of the two databases we have,
VACUUM ANALYZE VERBOSE shows about 133,000 pages on one and about
77,000 on the other. My max_fsm_pages is 250,000, well above their
combined total.

Other possibly related settings:

vacuum_cost_delay = 10
stats_start_collector = on
stats_row_level = on
autovacuum_vacuum_threshold = 300
autovacuum_analyze_threshold = 100
wal_buffers = 64
checkpoint_segments = 64
checkpoint_timeout = 900

Anything wrong with these?

I tried a CLUSTER on one index and it ran for about an hour without
completing.

#11 Mike Charnoky
noky@nextbus.com
In reply to: Joshua D. Drake (#2)
Re: problems with large table

Thanks, recreating the table solved my problems. Our team is working on
implementing some performance tuning based on other recommendations from
the list (FSM, etc.).

Mike

Joshua D. Drake wrote:


At this point, you are in a world of hurt :). If you stop a vacuum, you
have created a huge mess of dead rows in that table. My suggestion is
this: create a new table populated from the old one, rename the old
table out of the way, rename the new table to the old name, then run ANALYZE.