A question about Vacuum analyze

Started by Emi Luabout 20 years ago10 messagesgeneral
Jump to latest
#1Emi Lu
emilu@encs.concordia.ca

Hello,

We have a daily cronjob and in the cronjob we do:
1. truncate one table A
2. insert into table A
3. do comparision table A and table B and update table B accordingly

The doc says "VACUUM ANALYZE command for the affected table. This will
update the system catalogs with the results of all recent changes, and
allow the PostgreSQL query planner to make better choices in planning
queries".
So, I believe after insertion at step 2, comparision will be speed up if
we do a "vacuum analyze" right?

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

Thanks a lot!
Emi

#2Ragnar
gnari@hive.is
In reply to: Emi Lu (#1)
Re: A question about Vacuum analyze

On fim, 2006-02-16 at 09:12 -0500, Emi Lu wrote:

Hello,

We have a daily cronjob and in the cronjob we do:
1. truncate one table A
2. insert into table A
3. do comparision table A and table B and update table B accordingly

The doc says "VACUUM ANALYZE command for the affected table. This will
update the system catalogs with the results of all recent changes, and
allow the PostgreSQL query planner to make better choices in planning
queries".
So, I believe after insertion at step 2, comparision will be speed up if
we do a "vacuum analyze" right?

after only truncate and inserts , vacuum is not needed,
but ANALYZE is good.

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

after deleting or updating a significant percentage of
rows, VACUUM can be called for, but usually VACUUM is
done as part of regular maintenance, or by autovacuum.

gnari

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Emi Lu (#1)
Re: A question about Vacuum analyze

Emi Lu <emilu@encs.concordia.ca> writes:

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

You shouldn't need a VACUUM if you haven't yet done any updates or
deletes since the TRUNCATE. An ANALYZE seems like a good idea, though.
(You could get away without ANALYZE if the new data has essentially the
same statistics as the old, but if you're making only minor changes, why
are you using this technique at all ...)

regards, tom lane

#4Emi Lu
emilu@encs.concordia.ca
In reply to: Tom Lane (#3)
Re: A question about Vacuum analyze

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

You shouldn't need a VACUUM if you haven't yet done any updates or
deletes since the TRUNCATE. An ANALYZE seems like a good idea, though.
(You could get away without ANALYZE if the new data has essentially the
same statistics as the old, but if you're making only minor changes, why
are you using this technique at all ...)

After truncate table A, around 60,000 will be inserted. Then a
comparision will be done between table A and table B. After that, table
B will be updated according to the comparision result. Records inserted
into table A is increasing everyday.

So, your suggestion is that after the population of table A, the query
planner should be able to find the most efficient query plan because we
do truncate but not delete, and we do not need to do vacuum analyze at
all, right?

Thank you,
Emi

#5Ragnar
gnari@hive.is
In reply to: Emi Lu (#4)
Re: A question about Vacuum analyze

On fim, 2006-02-16 at 16:24 -0500, Emi Lu wrote:

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

You shouldn't need a VACUUM if you haven't yet done any updates or
deletes since the TRUNCATE. An ANALYZE seems like a good idea, though.
(You could get away without ANALYZE if the new data has essentially the
same statistics as the old, but if you're making only minor changes, why
are you using this technique at all ...)

After truncate table A, around 60,000 will be inserted. Then a
comparision will be done between table A and table B. After that, table
B will be updated according to the comparision result. Records inserted
into table A is increasing everyday.

So, your suggestion is that after the population of table A, the query
planner should be able to find the most efficient query plan because we
do truncate but not delete, and we do not need to do vacuum analyze at
all, right?

no. the suggestion was that a VACUUM is not needed, but
that an ANALYZE might be.

gnari

#6Emi Lu
emilu@encs.concordia.ca
In reply to: Ragnar (#5)
Re: A question about Vacuum analyze

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

You shouldn't need a VACUUM if you haven't yet done any updates or
deletes since the TRUNCATE. An ANALYZE seems like a good idea, though.
(You could get away without ANALYZE if the new data has essentially the
same statistics as the old, but if you're making only minor changes, why
are you using this technique at all ...)

After truncate table A, around 60,000 will be inserted. Then a
comparision will be done between table A and table B. After that, table
B will be updated according to the comparision result. Records inserted
into table A is increasing everyday.

So, your suggestion is that after the population of table A, the query
planner should be able to find the most efficient query plan because we
do truncate but not delete, and we do not need to do vacuum analyze at
all, right?

no. the suggestion was that a VACUUM is not needed, but
that an ANALYZE might be.

Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate these two operations? I guess "vacuum analyze" do both vacuum and analyze. Or "EXPLAIN ANALYZE" can do it for me?

Emi

#7Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Emi Lu (#6)
Re: A question about Vacuum analyze

On Fri, 2006-02-17 at 11:06, Emi Lu wrote:

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

You shouldn't need a VACUUM if you haven't yet done any updates or
deletes since the TRUNCATE. An ANALYZE seems like a good idea, though.
(You could get away without ANALYZE if the new data has essentially the
same statistics as the old, but if you're making only minor changes, why
are you using this technique at all ...)

After truncate table A, around 60,000 will be inserted. Then a
comparision will be done between table A and table B. After that, table
B will be updated according to the comparision result. Records inserted
into table A is increasing everyday.

So, your suggestion is that after the population of table A, the query
planner should be able to find the most efficient query plan because we
do truncate but not delete, and we do not need to do vacuum analyze at
all, right?

no. the suggestion was that a VACUUM is not needed, but
that an ANALYZE might be.

Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate these two operations? I guess "vacuum analyze" do both vacuum and analyze. Or "EXPLAIN ANALYZE" can do it for me?

Yeah, vacuum analyze is kind of a leftover from the olden days when you
could only run an analyze as part of a vacuum command. analyze has been
it's own command for quite some time now.

#8Ragnar
gnari@hive.is
In reply to: Emi Lu (#6)
Re: A question about Vacuum analyze

On f�s, 2006-02-17 at 12:06 -0500, Emi Lu wrote:

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

...

So, your suggestion is that after the population of table A, the query
planner should be able to find the most efficient query plan because we
do truncate but not delete, and we do not need to do vacuum analyze at
all, right?

...

Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate these two operations? I guess "vacuum analyze" do both vacuum and analyze.
Or "EXPLAIN ANALYZE" can do it for me?

VACUUM ensures that dead rows can be reused. Dead rows
are created by DELETE and UPDATE.
If you have done a significant number of DELETEs
or UPDATEs, you might want to VACUUM

ANALYZE collect statistical information about
your tables. this helps the planner make good plans.
After having changed your data significantly, you
might want to ANALYZE, for example after lots of
INSERTs, UPDATEs or DELETEs

TRUNCATE does not create dead rows, so you do
not need to VACUUM just because of that, but
you still might have to ANALYZE.

If you TRUNCATE a table and then repopulate it
with similar data as before, you do not have to
ANALYZE, as plans based on the old statistics
would assumedly be just as good.

EXPLAIN dislays the plan that will be chosen
for a query, along with some estimated cost
information.

EXPLAIN ANALYZE actually executes the query, and
shows same info as EXPLAIN, and in addition actual
cost information

Hope this makes it more clear

gnari

#9Chris Browne
cbbrowne@acm.org
In reply to: Emi Lu (#1)
Re: A question about Vacuum analyze

Quoth emilu@encs.concordia.ca (Emi Lu):

no. the suggestion was that a VACUUM is not needed, but that an
ANALYZE might be.

Thank you gnari for your answer. But I am a bit confused about not
running vacuum but only "analyze". Can I seperate these two
operations? I guess "vacuum analyze" do both vacuum and analyze. Or
"EXPLAIN ANALYZE" can do it for me?

EXPLAIN, ANALYZE, and VACUUM are different things; ANALYZE gets used
in two different contexts...

1. VACUUM is what cleans dead tuples out of tables.

e.g. VACUUM my_table;

2. VACUUM ANALYZE cleans out dead tuples and recalculates data
distributions

e.g. VACUUM ANALYZE my_table;

3. EXPLAIN describes query plans

e.g. EXPLAIN select * from my_table;

4. EXPLAIN ANALYZE compares query plan estimates to real results

e.g. EXPLAIN ANALYZE select * from my_table;

5. ANALYZE recalculates data distributions (as in 2, but without
cleaning out dead tuples).

e.g. ANALYZE my_table;

Pointedly, EXPLAIN ANALYZE is entirely distinct from ANALYZE and
VACUUM ANALYZE...
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/slony.html
Rules of the Evil Overlord #218. "I will not pick up a glowing ancient
artifact and shout "Its power is now mine!!!" Instead I will grab some
tongs, transfer it to a hazardous materials container, and transport
it back to my lab for study." <http://www.eviloverlord.com/&gt;

#10Emi Lu
emilu@encs.concordia.ca
In reply to: Chris Browne (#9)
Re: A question about Vacuum analyze

Thank you very much for all your inputs. I believe "analyze" is the one
I should use .

Show quoted text

Quoth emilu@encs.concordia.ca (Emi Lu):

no. the suggestion was that a VACUUM is not needed, but that an
ANALYZE might be.

Thank you gnari for your answer. But I am a bit confused about not
running vacuum but only "analyze". Can I seperate these two
operations? I guess "vacuum analyze" do both vacuum and analyze. Or
"EXPLAIN ANALYZE" can do it for me?

EXPLAIN, ANALYZE, and VACUUM are different things; ANALYZE gets used
in two different contexts...

1. VACUUM is what cleans dead tuples out of tables.

e.g. VACUUM my_table;

2. VACUUM ANALYZE cleans out dead tuples and recalculates data
distributions

e.g. VACUUM ANALYZE my_table;

3. EXPLAIN describes query plans

e.g. EXPLAIN select * from my_table;

4. EXPLAIN ANALYZE compares query plan estimates to real results

e.g. EXPLAIN ANALYZE select * from my_table;

5. ANALYZE recalculates data distributions (as in 2, but without
cleaning out dead tuples).

e.g. ANALYZE my_table;

Pointedly, EXPLAIN ANALYZE is entirely distinct from ANALYZE and
VACUUM ANALYZE...