Re: Vacuum time degrading
Wes <wespvp@syntegra.com> writes:
On 3/2/05 10:50 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
It wouldn't be easy --- there are some locking considerations that say
btbulkdelete needs to scan the index in the same order that an ordinary
scan would do. See the nbtree README for details.
Just a follow-up on this..
The vacuum time has been steadily increasing at a seemingly increasing rate,
although there are no deletes or updates to the database. The current DB
size is just over 500 million rows. Last week it was up to 6.84 hours to do
a vacuum. Over the weekend I reindexed all the major indexes. The two
largest indexes took about 10 hours to reindex both. After the reindexing,
the vacuum took only 1.44 hours. This is pretty much a linear scaling from
the original vacuum time I reported.
So, the increasing vacuum times would appear to be as Tom suggested - due to
the fact that vacuum processes indexes in index order, not physical disk
order. I guess we add a periodic reindex to our maintenance procedures...
That doesn't follow from what you said. Did you check that the physical
sizes of the indexes were comparable before and after the reindex?
regards, tom lane
Import Notes
Reply to msg id not found: BE76BBB0.9E2F%wespvp@syntegra.comReference msg id not found: BE76BBB0.9E2F%wespvp@syntegra.com
On 4/4/05 8:50 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
That doesn't follow from what you said. Did you check that the physical
sizes of the indexes were comparable before and after the reindex?
No, how do I do that (or where is it documented how to do it)?
How is it not consistent? I believe you suggested the reindex. The initial
timing was shortly after a database reload. The index would have been built
in sorted order, correct? This was the 1 hour time. After a period of
months, the index values have been inserted in random order. The DB size is
up 50% but the vacuum time is up a factor of 6+. Presumably the index is
being read by moving the heads all over the place. I reindex, and the index
is rebuilt in sorted order. Vacuum is now down to 1.5 hours - a linear
scaling from the original numbers. The locality of reference in reading the
indexes in order should be much better.
Wes
Wes <wespvp@syntegra.com> writes:
On 4/4/05 8:50 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
That doesn't follow from what you said. Did you check that the physical
sizes of the indexes were comparable before and after the reindex?
No, how do I do that (or where is it documented how to do it)?
The best way is probably to capture the output of VACUUM VERBOSE (not FULL)
for the table before and after REINDEX.
How is it not consistent?
I didn't say it wasn't consistent, just that it doesn't prove the
point. The speedup you saw could have been from elimination of index
bloat more than from bringing the index into physically sorted order.
An estimate of the overall database size doesn't really tell us how
much this particular table's indexes changed in size.
regards, tom lane
On 4/5/05 11:15 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
I didn't say it wasn't consistent, just that it doesn't prove the
point. The speedup you saw could have been from elimination of index
bloat more than from bringing the index into physically sorted order.
An estimate of the overall database size doesn't really tell us how
much this particular table's indexes changed in size.
Ok, now I follow. Taking the biggest indexes:
The weekend before:
INFO: index "message_recipients_i_recip_date" now contains 393961361 row
versions in 2435100 pages
INFO: index "message_recipients_i_message" now contains 393934394 row
versions in 1499853 pages
After reindex:
INFO: index "message_recipients_i_recip_date" now contains 401798357 row
versions in 1765613 pages
INFO: index "message_recipients_i_message" now contains 401787237 row
versions in 1322974 pages
Wes
Wes <wespvp@syntegra.com> writes:
Ok, now I follow. Taking the biggest indexes:
The weekend before:
INFO: index "message_recipients_i_recip_date" now contains 393961361 row
versions in 2435100 pages
INFO: index "message_recipients_i_message" now contains 393934394 row
versions in 1499853 pages
After reindex:
INFO: index "message_recipients_i_recip_date" now contains 401798357 row
versions in 1765613 pages
INFO: index "message_recipients_i_message" now contains 401787237 row
versions in 1322974 pages
OK, that's certainly not a factor-of-four difference in size, so I'm
now convinced you're right: bringing the index into physical order is
having a big impact on the runtime.
From a development standpoint, that suggests a couple of TODO items:
* Look harder at whether VACUUM can scan the index in physical instead
of logical order.
* See whether ordinary btree maintenance (ie page splits) can do
anything to maintain/improve the physical ordering of the index.
regards, tom lane