Arrays and Performance
A few performance issues with PostgreSQL's arrays led us to the
question of how Postgres actually stores variable-length arrays.
First, let me explain our situation.
We have a rather large table containing a simple integer primary key
and a couple more columns of fixed size. However, there is a dates
column of type "timestamp without time zone[]" that is apparently
causing some severe performance problems.
During a daily update process, new timestamps are collected and
existing rows are updated (new rows are also added).
These changes affect a large percentage of the existing rows.
What we have been observing over the last few weeks is that the
overall database size is increasing rapidly due to this table, and
vacuum processes seem to deadlock with other processes querying data
from this table.
Therefore, the database keeps growing and becomes less and less
usable. The only thing that helps is dumping and restoring it, which
is nothing you are eager to do on a large live system on a daily basis.
This problem led us to the question of how these arrays are stored
internally. Are they stored "in place" with the other columns or
merely as a pointer to another file?
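[Editor's note: one way to inspect this is pg_column_size(), which reports the stored (possibly compressed) size of a value; PostgreSQL keeps large variable-length values such as arrays out of line in the table's associated TOAST relation once they exceed roughly 2 kB. The table and column names below are illustrative, not the poster's real schema.]

```sql
-- Illustrative only: show the stored size of the array column for
-- the widest rows; values well beyond ~2 kB will have been moved
-- out of line into the table's TOAST relation.
SELECT id,
       pg_column_size(dates) AS dates_bytes,
       array_upper(dates, 1) AS n_timestamps
FROM   big_table
ORDER  BY pg_column_size(dates) DESC
LIMIT  10;
```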
Would it be more efficient not to use an array for this purpose, but
to split the table into two parts?
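[Editor's note: a minimal sketch of what such a split might look like, assuming a normalized child table with one row per timestamp; all table and column names here are invented for illustration.]

```sql
-- Hypothetical split: keep the fixed-size columns in one table and
-- move the timestamps into a child table, one row per timestamp,
-- so daily updates only insert/delete the rows that change instead
-- of rewriting a whole array value.
CREATE TABLE items (
    id integer PRIMARY KEY
    -- ... other fixed-size columns ...
);

CREATE TABLE item_dates (
    item_id integer NOT NULL REFERENCES items(id),
    ts      timestamp without time zone NOT NULL,
    PRIMARY KEY (item_id, ts)
);
```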
Any help is appreciated!
Marc Philipp
s_philip@ira.uka.de wrote:
Would it be more efficient not to use an array for this purpose, but
to split the table into two parts? Any help is appreciated!
This is a duplicate of your post from the other day, to which I
responded, as did Tom Lane:
http://archives.postgresql.org/pgsql-general/2006-01/msg00104.php
http://archives.postgresql.org/pgsql-general/2006-01/msg00108.php
Did you not receive those replies?
Joe
On Fri, Jan 06, 2006 at 09:43:53AM +0100, s_philip@ira.uka.de wrote:
What we have been observing over the last few weeks is that the
overall database size is increasing rapidly due to this table, and
vacuum processes seem to deadlock with other processes querying data
from this table.
Are you seeing deadlock errors? How often are you vacuuming?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Sorry for the duplicate post! My first post was stalled, and my mail
server was down for a day or so. I will reply to your original posts.
Regards, Marc Philipp
No, we don't get deadlock errors, but when a vacuum runs while another
process is writing to the database, both stop making progress at some
point and nothing happens until one of the processes is killed.
I think we used to vacuum every two nights and did a full vacuum once a
week.
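[Editor's note: the locking behavior of the two vacuum variants is relevant here; a plain VACUUM does not block concurrent reads and writes, while VACUUM FULL takes an exclusive lock on the table and will block writers until it finishes, which matches the observed stall. The table name below is made up.]

```sql
-- Plain VACUUM: reclaims dead row space for reuse without taking an
-- exclusive lock, so it can run alongside normal queries. For heavily
-- updated tables it is typically run at least nightly.
VACUUM ANALYZE big_table;

-- VACUUM FULL: compacts the table file, but holds an exclusive lock
-- for the duration, blocking concurrent writers.
VACUUM FULL big_table;
```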
Regards, Marc Philipp