Vacuuming

Started by Paul Lambertalmost 19 years ago9 messagesgeneral
Jump to latest
#1Paul Lambert
paul.lambert@autoledgers.com.au

Is there any point to vacuuming a table if it has been bulk-populated by
data after a truncate?

I.e. If I do this:
TRUNCATE TABLE vehicles;
INSERT INTO vehicles (SELECT DISTINCT ON (dealer_id,vehicle_address) *
FROM vehicles_temp_load WHERE (dealer_id,vehicle_address) is not null);

Is there any point in vacuuming?

Also, is there any point in recreating indexes on this table after a
load like this or will indexes have been correctly maintained/updated by
the above insert. Note: This insert often loads tens of millions of records.

BTW, this is on Windows.

Thanks,
Paul.

--
Paul Lambert
Database Administrator
AutoLedgers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Paul Lambert (#1)
Re: Vacuuming

Paul Lambert <paul.lambert@autoledgers.com.au> writes:

Is there any point to vacuuming a table if it has been bulk-populated by
data after a truncate?

I.e. If I do this:
TRUNCATE TABLE vehicles;
INSERT INTO vehicles (SELECT DISTINCT ON (dealer_id,vehicle_address) *
FROM vehicles_temp_load WHERE (dealer_id,vehicle_address) is not null);

Is there any point in vacuuming?

The only thing a vacuum would do for you there is set the commit hint
bits on the newly-inserted rows. Which might be worth doing if you want
to get the table into a totally "clean" state, but it's probably a bit
excessive. SELECTs on the table will set the hint bits anyway as
they visit not-yet-hinted rows, so it's really a matter of do you want
to pay that overhead all at once or spread-out.

What you *do* want to do in this situation is an ANALYZE.

regards, tom lane

#3Paul Lambert
paul.lambert@autoledgers.com.au
In reply to: Tom Lane (#2)
Re: Vacuuming

Tom Lane wrote:

The only thing a vacuum would do for you there is set the commit hint
bits on the newly-inserted rows. Which might be worth doing if you want
to get the table into a totally "clean" state, but it's probably a bit
excessive. SELECTs on the table will set the hint bits anyway as
they visit not-yet-hinted rows, so it's really a matter of do you want
to pay that overhead all at once or spread-out.

What you *do* want to do in this situation is an ANALYZE.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org/

Thanks Tom.

Should the ANALYZE be done before or after indexes are built? Or is that
irrelevant? Should I not even bother rebuilding indexes when I do these
loads?

Currently I:
1) Drop Indexes
2) Truncate and copy in new data
3) Vacuum - now changed to analyze.
4) Create indexes

I add steps one and four on the assumption that adding 40 million
records in one hit might get the indexes confused - but if they are
pretty stable I can remove these steps.

P.

--
Paul Lambert
Database Administrator
AutoLedgers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Paul Lambert (#3)
Re: Vacuuming

Paul Lambert <paul.lambert@autoledgers.com.au> writes:

Tom Lane wrote:

What you *do* want to do in this situation is an ANALYZE.

Should the ANALYZE be done before or after indexes are built? Or is that
irrelevant?

For ordinary indexes it doesn't matter. If you have any expression
indexes then you should build them before running ANALYZE, because
ANALYZE takes the hint to collect stats on those expressions as well
as the raw column values. (Eventually this advice might apply to
multicolumn and partial indexes as well, but right now ANALYZE doesn't
treat those specially, AFAIR.) In any case there's no good reason
to do ANALYZE first if you have a free choice.

Should I not even bother rebuilding indexes when I do these loads?

There's some value in the advice to "drop indexes, load data, recreate
indexes". TRUNCATE will happily truncate the indexes to nothing along
with the table, but when you then load data you are building the indexes
incrementally instead of in-bulk. This process is slower than a bulk
index build and ends up with a more-fragmented index. (At least for
btree indexes --- I'm not sure which other index types are smarter
about bulk vs incremental build.)

Currently I:
1) Drop Indexes
2) Truncate and copy in new data
3) Vacuum - now changed to analyze.
4) Create indexes

I'd interchange steps 3 and 4; otherwise you are good.

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: Vacuuming

I wrote:

Is there any point in vacuuming?

The only thing a vacuum would do for you there is set the commit hint
bits on the newly-inserted rows. Which might be worth doing if you want
to get the table into a totally "clean" state, but it's probably a bit
excessive. SELECTs on the table will set the hint bits anyway as
they visit not-yet-hinted rows, so it's really a matter of do you want
to pay that overhead all at once or spread-out.

I forgot to mention that any other operation that examines every table
row will fix all the hint bits as well. In particular a CREATE INDEX
would do that --- so if you are planning to create some indexes then
there's certainly no point in a VACUUM just after a table load.

regards, tom lane

#6Paul Lambert
paul.lambert@autoledgers.com.au
In reply to: Tom Lane (#5)
Re: Vacuuming

Tom Lane wrote:

I forgot to mention that any other operation that examines every table
row will fix all the hint bits as well. In particular a CREATE INDEX
would do that --- so if you are planning to create some indexes then
there's certainly no point in a VACUUM just after a table load.

regards, tom lane

Thanks for all the help Tom, educational as always.

--
Paul Lambert
Database Administrator
AutoLedgers

#7Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Tom Lane (#5)
Re: Vacuuming

On 5/8/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I forgot to mention that any other operation that examines every table
row will fix all the hint bits as well. In particular a CREATE INDEX
would do that ---

I might be missing something, but I think CREATE INDEX work on
SnapshotAny and hence may not actually examine any table row ?

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavan Deolasee (#7)
Re: Vacuuming

"Pavan Deolasee" <pavan.deolasee@gmail.com> writes:

On 5/8/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I forgot to mention that any other operation that examines every table
row will fix all the hint bits as well. In particular a CREATE INDEX
would do that ---

I might be missing something, but I think CREATE INDEX work on
SnapshotAny and hence may not actually examine any table row ?

SnapshotAny is a no-op, but HeapTupleSatisfiesVacuum isn't.

regards, tom lane

#9Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Tom Lane (#8)
Re: Vacuuming

On 5/8/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

SnapshotAny is a no-op, but HeapTupleSatisfiesVacuum isn't.

Oh yes. My apologies for forgetting IndexBuildHeapScan()

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com