vacuum

Started by Mike Cianflone · almost 25 years ago · 11 messages · hackers
#1Mike Cianflone
mcianflone@littlefeet-inc.com

I know that vacuum has come up in the past, and even saw the
discussion about putting a cron entry to have it run every once in a while,
but I don't remember seeing anything about having it kick off via a trigger
every so many inserts.
Is there a relative consensus for how often to run vacuum? I have a
table of about 8 columns that I fill with 100,000 items simply via a "\i
alarms.sql". After 1,000 items or so it gets extremely slow to fill with
data, and will take over a day to fill the entire thing unless I run vacuum
once a minute.
Is there a way to run vacuum every so many inserts, updates, and
deletes? (With one table a trigger is easy, but is there a global way to do
it no matter which tables are involved?)

Do other DBMSs such as Oracle run into the same problem and need to
be vacuumed, in their own sort of way?

Thanks for any insight,
Mike

#2Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Mike Cianflone (#1)
AW: vacuum

Is there a relative consensus for how often to run vacuum? I have a
table of about 8 columns that I fill with 100,000 items simply via a "\i
alarms.sql". After 1,000 items or so it gets extremely slow to fill with
data, and will take over a day to fill the entire thing unless I run vacuum
once a minute.

You will have to tell us what exactly your alarms.sql does and what indexes
your table has. The behavior above is certainly not to be expected in general;
the "vacuum once a minute" part is especially suspicious.

For a series of insert-only statements, vacuum is not supposed to help at
all, so there must be an update hidden somewhere.

Andreas

#3The Hermit Hacker
scrappy@hub.org
In reply to: Zeugswetter Andreas SB (#2)
Re: AW: vacuum

On Wed, 13 Jun 2001, Zeugswetter Andreas SB wrote:

Is there a relative consensus for how often to run vacuum? I have a
table of about 8 columns that I fill with 100,000 items simply via a "\i
alarms.sql". After 1,000 items or so it gets extremely slow to fill with
data, and will take over a day to fill the entire thing unless I run vacuum
once a minute.

You will have to tell us what exactly your alarms.sql does and what indexes
your table has. The behavior above is certainly not to be expected in general;
the "vacuum once a minute" part is especially suspicious.

For a series of insert-only statements, vacuum is not supposed to help at
all, so there must be an update hidden somewhere.

the one question I'd have is: what does the table size look like
before/after this 'once a minute vacuum'? I have one client that has
continuous UPDATEs happening to his table, no INSERTs, no DELETEs ... and
his tables quickly grow from an 8k table to 65Meg if there is no vacuum
happening every few *hours* ...

The other question is: what are your startup parameters? What sort of
shared memory buffer are you working with? I imagine that VACUUM does a
flush to disk, so are 1000 items filling your buffer(s), causing an almost
continuous fsync to disk for each INSERT after that ... so the VACUUM is
clearing the buffer(s) to disk, giving you space again to do quick
INSERTs for 1000 records again?

#4Andy Samuel
andysamuel@geocities.com
In reply to: The Hermit Hacker (#3)
Re: AW: vacuum

The same question ... what's the size after you vacuum the tables/db?

continuous UPDATEs happening to his table, no INSERTs, no DELETEs ... and
his tables quickly grow from an 8k table to 65Meg if there is no vacuum
happening every few *hours* ...

TIA
Andy

#5The Hermit Hacker
scrappy@hub.org
In reply to: Andy Samuel (#4)
Re: AW: vacuum

8k ...

On Wed, 13 Jun 2001, Andy Samuel wrote:

The same question ... how's the size after you vacuum the tables/db ?

continuous UPDATEs happening to his table, no INSERTs, no DELETEs ... and
his tables quickly grow from an 8k table to 65Meg if there is no vacuum
happening every few *hours* ...

TIA
Andy

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#6Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: The Hermit Hacker (#5)
AW: AW: vacuum

The other question is: what are your startup parameters? What sort of
shared memory buffer are you working with? I imagine that VACUUM does a
flush to disk, so are 1000 items filling your buffer(s), causing an almost
continuous fsync to disk for each INSERT after that ... so the VACUUM is
clearing the buffer(s) to disk, giving you space again to do quick
INSERTs for 1000 records again?

Aah yes, you mean that the pages are only written during checkpoints?
Then he would not need a vacuum, but a forced checkpoint.
That actually implies that we need yet another checkpoint-initiating
event, e.g. do a checkpoint when the buffer pool is 75% dirty. Or better yet,
have a daemon that writes out dirty pages from the back to the front of the
LRU queue until only a configurable number of buffers remain dirty.

Andreas

#7Doug McNaught
doug@wireboard.com
In reply to: Mike Cianflone (#1)
Re: vacuum

Mike Cianflone <mcianflone@littlefeet-inc.com> writes:

I know that vacuum has come up in the past, and even saw the
discussion about putting a cron entry to have it run every once in a while,
but I don't remember seeing anything about having it kick off via a trigger
every so may inserts.
Is there a relative consensus for how often to run vacuum? I have a
table of about 8 columns that I fill with 100,000 items simply via a "\i
alarms.sql". After 1,000 items or so it gets extremely slow to fill with
data, and will take over a day to fill the entire thing unless I run vacuum
once a minute.

Ummm... Are you wrapping the inserts in a transaction? If not, no
wonder it's slow--you're paying transaction overhead for each insert.

I don't think VACUUM has that much effect on insert performance, as
insert doesn't leave empty space in existing pages.

Try using COPY instead of INSERT, or at least wrap all your inserts in
a transaction. Also, drop indexes before running the insert, and
recreate them afterwards.
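Doug's three suggestions might combine into a load script along these lines (a sketch only; the table, index, and file names are invented, and COPY reads the file on the server side):

```sql
-- Hypothetical bulk-load sketch following the advice above.

DROP INDEX alarms_id_idx;             -- 1. drop indexes before loading

BEGIN;                                -- 2. one transaction, or ...
COPY alarms FROM '/tmp/alarms.dat';   -- 3. ... better, a single COPY command
COMMIT;

CREATE INDEX alarms_id_idx ON alarms (alarm_id);  -- rebuild afterwards
```

Loading this way pays the transaction overhead once instead of 100,000 times, and builds the index in one pass at the end.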

-Doug
--
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time... --Dylan

#8Mike Cianflone
mcianflone@littlefeet-inc.com
In reply to: Doug McNaught (#7)
RE: vacuum

After the comment by someone about the UPDATE being responsible for
the need to vacuum (sorry, I didn't know that), I looked into a stored
procedure that gets triggered on every insert. The stored procedure does an
UPDATE on another table for each insert, so inserting 100,000 items into one
table causes 100,000 updates on the other table. I noticed that the other
table's file gets very large (right now it's over a megabyte with only 10%
of the inserts done), even though there are only about 5 rows in that table.
Since that table has the UPDATEs happening to it, it keeps getting larger;
a vacuum chops it back down to 8K.
I tried increasing the buffer size, and that made the 100,000
inserts (with the corresponding update) go longer before hitting the barrier
and slowing down tremendously (until another vacuum is done).

Since the need to vacuum isn't tied to time, but rather to the size of the
buffers (or the indexes?), it would seem plausible to do as another person
mentioned and have vacuum kick off when the buffers are xx% full.

Mike

-----Original Message-----
From: Zeugswetter Andreas SB [mailto:ZeugswetterA@wien.spardat.at]
Sent: Wednesday, June 13, 2001 1:04 AM
To: 'Mike Cianflone'; Hackers List
Subject: AW: [HACKERS] vacuum

Is there a relative consensus for how often to run vacuum? I have a
table of about 8 columns that I fill with 100,000 items simply via a "\i
alarms.sql". After 1,000 items or so it gets extremely slow to fill with
data, and will take over a day to fill the entire thing unless I run vacuum
once a minute.

You will have to tell us, what exactly your alarms.sql does, and what indexes
your table has. Above behavior is certainly not to be expected in general,
especially the "vacuum once a minute" is highly suspicious.

For a series of insert only statements, the vacuum is not supposed to help
at all, thus there must be an update hidden somewhere.

Andreas

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Mike Cianflone (#8)
RE: vacuum

Mike Cianflone writes:

After the comment by someone about the UPDATE being responsible for
the reason for vacuuming (sorry, I didn't know that), I looked into a stored
procedure that gets triggered during an insert. The stored procedure does an
UPDATE on another table, for every insert. So inserting 100,000 items into
the table causes an update on 100,000 items in another table. I noticed that
the other table's file size gets very large (right now it's over a megabyte
and only 10% complete inserting), even though there are only about 5 items
in that table. Since that table has the UPDATE happening to it, it's getting
large. A vacuum chops it down to 8K.

An UPDATE basically acts as "append and mark old record obsolete". This
is the so-called non-overwriting storage manager which keeps all data,
even deleted data, indefinitely until you run vacuum (hence the name).
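A minimal way to watch this non-overwriting behavior in action (hypothetical table; the exact on-disk sizes will vary):

```sql
-- Each UPDATE appends a new row version; the old version stays on
-- disk as a dead tuple until VACUUM reclaims it.
CREATE TABLE t (id int, n int);
INSERT INTO t VALUES (1, 0);

UPDATE t SET n = n + 1;   -- the table file now holds two row versions
UPDATE t SET n = n + 1;   -- ... three versions, and so on

VACUUM t;                 -- dead versions reclaimed; one live row remains
```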

I tried increasing the buffer size, and that made the 100,000
inserts (with the corresponding update) go longer before hitting the barrier
and slowing down tremendously (until another vacuum is done).

Increasing the buffer size will keep the data in memory longer before
pushing it out to disk. The net result will be the same though.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#10The Hermit Hacker
scrappy@hub.org
In reply to: Mike Cianflone (#8)
RE: vacuum

Now that you've narrowed it down to a specific table, at least you can
specifically vacuum just that table and ignore the rest of the database
...might help a bit?
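VACUUM accepts an optional table name, so the cleanup can be limited to the one bloated table (name invented here):

```sql
VACUUM alarm_summary;          -- vacuum just the problem table

VACUUM VERBOSE alarm_summary;  -- VERBOSE reports how many tuples were removed
```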


#11Ross J. Reedstrom
reedstrm@rice.edu
In reply to: The Hermit Hacker (#10)
Re: vacuum

On Wed, Jun 13, 2001 at 06:24:10PM -0300, The Hermit Hacker wrote:

Now that you've narrowed it down to a specific table, at least you can
specifically vacuum just that table and ignore the rest of the database
...might help a bit?

Even better: since he's loading a script anyway, the script could start
a transaction, disable the trigger, load the data, bulk UPDATE the other
table, and re-enable the trigger. I bet it takes only a few minutes to
do the whole thing that way.
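A sketch of the script Ross describes (every name is invented; dropping and recreating the trigger is one way to "disable" it on releases that have no dedicated disable command):

```sql
-- Hypothetical load script following Ross's outline.
BEGIN;

DROP TRIGGER alarms_ins_trig ON alarms;        -- "disable" the trigger

COPY alarms FROM '/tmp/alarms.dat';            -- bulk-load the data

-- One bulk UPDATE of the summary table instead of 100,000 per-row ones:
UPDATE alarm_summary SET alarm_count = (SELECT count(*) FROM alarms);

CREATE TRIGGER alarms_ins_trig                 -- re-enable it
    AFTER INSERT ON alarms
    FOR EACH ROW EXECUTE PROCEDURE bump_alarm_summary();

COMMIT;
```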

Ross
