Thread: A question about Vacuum analyze

A question about Vacuum analyze

From
Emi Lu
Date:
Hello,

We have a daily cronjob and in the cronjob we do:
  1.  truncate one table A
  2.  insert into table A
  3.  do comparision table A and table B and update table B accordingly

The doc says "VACUUM ANALYZE command for the affected table. This will
update the system catalogs with the results of all recent changes, and
allow the PostgreSQL query planner to make better choices in planning
queries".
So, I believe after insertion at step 2, comparision will be speed up if
we do a "vacuum analyze" right?

In another way, whenever we "delete/truncate and then insert" data into
a table, it is better to "vacuum anaylze"?

Thanks a lot!
Emi


Re: A question about Vacuum analyze

From
Ragnar
Date:
On fim, 2006-02-16 at 09:12 -0500, Emi Lu wrote:
> Hello,
>
> We have a daily cronjob and in the cronjob we do:
>   1.  truncate one table A
>   2.  insert into table A
>   3.  do comparision table A and table B and update table B accordingly
>
> The doc says "VACUUM ANALYZE command for the affected table. This will
> update the system catalogs with the results of all recent changes, and
> allow the PostgreSQL query planner to make better choices in planning
> queries".
> So, I believe after insertion at step 2, comparision will be speed up if
> we do a "vacuum analyze" right?

after only truncate and inserts , vacuum is not needed,
but ANALYZE is good.

> In another way, whenever we "delete/truncate and then insert" data into
> a table, it is better to "vacuum anaylze"?

after deleting or updating a significant percentage of
rows, VACUUM can be called for, but usually VACUUM is
done as part of regular maintenance, or by autovacuum.

gnari



Re: A question about Vacuum analyze

From
Tom Lane
Date:
Emi Lu <emilu@encs.concordia.ca> writes:
> In another way, whenever we "delete/truncate and then insert" data into
> a table, it is better to "vacuum anaylze"?

You shouldn't need a VACUUM if you haven't yet done any updates or
deletes since the TRUNCATE.  An ANALYZE seems like a good idea, though.
(You could get away without ANALYZE if the new data has essentially the
same statistics as the old, but if you're making only minor changes, why
are you using this technique at all ...)

            regards, tom lane

Re: A question about Vacuum analyze

From
Emi Lu
Date:
>>In another way, whenever we "delete/truncate and then insert" data into
>>a table, it is better to "vacuum anaylze"?
>>
>>
>You shouldn't need a VACUUM if you haven't yet done any updates or
>deletes since the TRUNCATE.  An ANALYZE seems like a good idea, though.
>(You could get away without ANALYZE if the new data has essentially the
>same statistics as the old, but if you're making only minor changes, why
>are you using this technique at all ...)
>
>
After truncate table A, around 60,000 will be inserted. Then a
comparision will be done between table A and table B.  After that, table
B will be updated according to the comparision result. Records inserted
into table A is increasing everyday.

So, your suggestion is that after the population of table A, the query
planner should be able to find the most efficient query plan because we
do truncate but not delete, and we do not need to do vacuum analyze at
all, right?

Thank you,
Emi


Re: A question about Vacuum analyze

From
Ragnar
Date:
On fim, 2006-02-16 at 16:24 -0500, Emi Lu wrote:
> >>In another way, whenever we "delete/truncate and then insert" data into
> >>a table, it is better to "vacuum anaylze"?
> >>
> >>
> >You shouldn't need a VACUUM if you haven't yet done any updates or
> >deletes since the TRUNCATE.  An ANALYZE seems like a good idea, though.
> >(You could get away without ANALYZE if the new data has essentially the
> >same statistics as the old, but if you're making only minor changes, why
> >are you using this technique at all ...)
> >
> >
> After truncate table A, around 60,000 will be inserted. Then a
> comparision will be done between table A and table B.  After that, table
> B will be updated according to the comparision result. Records inserted
> into table A is increasing everyday.
>
> So, your suggestion is that after the population of table A, the query
> planner should be able to find the most efficient query plan because we
> do truncate but not delete, and we do not need to do vacuum analyze at
> all, right?

no. the suggestion was that a VACUUM is not needed, but
that an ANALYZE might be.

gnari



Re: A question about Vacuum analyze

From
Emi Lu
Date:
>>>>In another way, whenever we "delete/truncate and then insert" data into
>>>>a table, it is better to "vacuum anaylze"?
>>>>
>>>>
>>>>
>>>>
>>>You shouldn't need a VACUUM if you haven't yet done any updates or
>>>deletes since the TRUNCATE.  An ANALYZE seems like a good idea, though.
>>>(You could get away without ANALYZE if the new data has essentially the
>>>same statistics as the old, but if you're making only minor changes, why
>>>are you using this technique at all ...)
>>>
>>>
>>>
>>>
>>After truncate table A, around 60,000 will be inserted. Then a
>>comparision will be done between table A and table B.  After that, table
>>B will be updated according to the comparision result. Records inserted
>>into table A is increasing everyday.
>>
>>So, your suggestion is that after the population of table A, the query
>>planner should be able to find the most efficient query plan because we
>>do truncate but not delete, and we do not need to do vacuum analyze at
>>all, right?
>>
>>
>
>no. the suggestion was that a VACUUM is not needed, but
>that an ANALYZE might be.
>
>

Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate
thesetwo operations? I guess "vacuum analyze" do both vacuum and analyze. Or "EXPLAIN ANALYZE" can do it for me? 

Emi








Re: A question about Vacuum analyze

From
Scott Marlowe
Date:
On Fri, 2006-02-17 at 11:06, Emi Lu wrote:
> >>>>In another way, whenever we "delete/truncate and then insert" data into
> >>>>a table, it is better to "vacuum anaylze"?
> >>>>
> >>>>
> >>>>
> >>>>
> >>>You shouldn't need a VACUUM if you haven't yet done any updates or
> >>>deletes since the TRUNCATE.  An ANALYZE seems like a good idea, though.
> >>>(You could get away without ANALYZE if the new data has essentially the
> >>>same statistics as the old, but if you're making only minor changes, why
> >>>are you using this technique at all ...)
> >>>
> >>>
> >>>
> >>>
> >>After truncate table A, around 60,000 will be inserted. Then a
> >>comparision will be done between table A and table B.  After that, table
> >>B will be updated according to the comparision result. Records inserted
> >>into table A is increasing everyday.
> >>
> >>So, your suggestion is that after the population of table A, the query
> >>planner should be able to find the most efficient query plan because we
> >>do truncate but not delete, and we do not need to do vacuum analyze at
> >>all, right?
> >>
> >>
> >
> >no. the suggestion was that a VACUUM is not needed, but
> >that an ANALYZE might be.
> >
> >
>
> Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate
thesetwo operations? I guess "vacuum analyze" do both vacuum and analyze. Or "EXPLAIN ANALYZE" can do it for me? 


Yeah, vacuum analyze is kind of a leftover from the olden days when you
could only run an analyze as part of a vacuum command.  analyze has been
it's own command for quite some time now.

Re: A question about Vacuum analyze

From
Ragnar
Date:
On fös, 2006-02-17 at 12:06 -0500, Emi Lu wrote:
> >>>>In another way, whenever we "delete/truncate and then insert" data into
> >>>>a table, it is better to "vacuum anaylze"?
> >>>>
> ...
> >>
> >>So, your suggestion is that after the population of table A, the query
> >>planner should be able to find the most efficient query plan because we
> >>do truncate but not delete, and we do not need to do vacuum analyze at
> >>all, right?
> >>
> >>
> ...
>
> Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate
thesetwo operations? I guess "vacuum analyze" do both vacuum and analyze.  
> Or "EXPLAIN ANALYZE" can do it for me?

VACUUM ensures that dead rows can be reused. Dead rows
are created by DELETE and UPDATE.
If you have done a significant number of DELETEs
or UPDATEs, you might want to VACUUM

ANALYZE collect statistical information about
your tables. this helps the planner make good plans.
After having changed your data significantly, you
might want to ANALYZE, for example after lots of
INSERTs, UPDATEs or DELETEs

TRUNCATE does not create dead rows, so you do
not need to VACUUM just because of that, but
you still might have to ANALYZE.

If you TRUNCATE a table and then repopulate it
with similar data as before, you do not have to
ANALYZE, as plans based on the old statistics
would assumedly be just as good.

EXPLAIN dislays the plan that will be chosen
for a query, along with some estimated cost
information.

EXPLAIN ANALYZE actually executes the query, and
shows same info as EXPLAIN, and in addition actual
cost information

Hope this makes it more clear

gnari



Re: A question about Vacuum analyze

From
Christopher Browne
Date:
Quoth emilu@encs.concordia.ca (Emi Lu):
>> no. the suggestion was that a VACUUM is not needed, but that an
>> ANALYZE might be.
>
> Thank you gnari for your answer. But I am a bit confused about not
> running vacuum but only "analyze". Can I seperate these two
> operations? I guess "vacuum analyze" do both vacuum and analyze. Or
> "EXPLAIN ANALYZE" can do it for me?

EXPLAIN, ANALYZE, and VACUUM are different things; ANALYZE gets used
in two different contexts...

1.  VACUUM is what cleans dead tuples out of tables.

  e.g. VACUUM my_table;

2.  VACUUM ANALYZE cleans out dead tuples and recalculates data
    distributions

  e.g. VACUUM ANALYZE my_table;

3.  EXPLAIN describes query plans

  e.g. EXPLAIN select * from my_table;

4.  EXPLAIN ANALYZE compares query plan estimates to real results

  e.g. EXPLAIN ANALYZE select * from my_table;

5.  ANALYZE recalculates data distributions (as in 2, but without
    cleaning out dead tuples).

  e.g. ANALYZE my_table;

Pointedly, EXPLAIN ANALYZE is entirely distinct from ANALYZE and
VACUUM ANALYZE...
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/slony.html
Rules of the Evil Overlord #218. "I will not pick up a glowing ancient
artifact and shout "Its power is now mine!!!" Instead I will grab some
tongs, transfer  it to a hazardous materials  container, and transport
it back to my lab for study." <http://www.eviloverlord.com/>

Re: A question about Vacuum analyze

From
Emi Lu
Date:
Thank you very much for all your inputs. I believe "analyze" is the one
I should use .

>Quoth emilu@encs.concordia.ca (Emi Lu):
>
>
>>>no. the suggestion was that a VACUUM is not needed, but that an
>>>ANALYZE might be.
>>>
>>>
>>Thank you gnari for your answer. But I am a bit confused about not
>>running vacuum but only "analyze". Can I seperate these two
>>operations? I guess "vacuum analyze" do both vacuum and analyze. Or
>>"EXPLAIN ANALYZE" can do it for me?
>>
>>
>
>EXPLAIN, ANALYZE, and VACUUM are different things; ANALYZE gets used
>in two different contexts...
>
>1.  VACUUM is what cleans dead tuples out of tables.
>
>  e.g. VACUUM my_table;
>
>2.  VACUUM ANALYZE cleans out dead tuples and recalculates data
>    distributions
>
>  e.g. VACUUM ANALYZE my_table;
>
>3.  EXPLAIN describes query plans
>
>  e.g. EXPLAIN select * from my_table;
>
>4.  EXPLAIN ANALYZE compares query plan estimates to real results
>
>  e.g. EXPLAIN ANALYZE select * from my_table;
>
>5.  ANALYZE recalculates data distributions (as in 2, but without
>    cleaning out dead tuples).
>
>  e.g. ANALYZE my_table;
>
>Pointedly, EXPLAIN ANALYZE is entirely distinct from ANALYZE and
>VACUUM ANALYZE...
>
>