Thread: A question about Vacuum analyze
Hello, We have a daily cronjob and in the cronjob we do: 1. truncate one table A 2. insert into table A 3. do comparision table A and table B and update table B accordingly The doc says "VACUUM ANALYZE command for the affected table. This will update the system catalogs with the results of all recent changes, and allow the PostgreSQL query planner to make better choices in planning queries". So, I believe after insertion at step 2, comparision will be speed up if we do a "vacuum analyze" right? In another way, whenever we "delete/truncate and then insert" data into a table, it is better to "vacuum anaylze"? Thanks a lot! Emi
On fim, 2006-02-16 at 09:12 -0500, Emi Lu wrote: > Hello, > > We have a daily cronjob and in the cronjob we do: > 1. truncate one table A > 2. insert into table A > 3. do comparision table A and table B and update table B accordingly > > The doc says "VACUUM ANALYZE command for the affected table. This will > update the system catalogs with the results of all recent changes, and > allow the PostgreSQL query planner to make better choices in planning > queries". > So, I believe after insertion at step 2, comparision will be speed up if > we do a "vacuum analyze" right? after only truncate and inserts , vacuum is not needed, but ANALYZE is good. > In another way, whenever we "delete/truncate and then insert" data into > a table, it is better to "vacuum anaylze"? after deleting or updating a significant percentage of rows, VACUUM can be called for, but usually VACUUM is done as part of regular maintenance, or by autovacuum. gnari
Emi Lu <emilu@encs.concordia.ca> writes: > In another way, whenever we "delete/truncate and then insert" data into > a table, it is better to "vacuum anaylze"? You shouldn't need a VACUUM if you haven't yet done any updates or deletes since the TRUNCATE. An ANALYZE seems like a good idea, though. (You could get away without ANALYZE if the new data has essentially the same statistics as the old, but if you're making only minor changes, why are you using this technique at all ...) regards, tom lane
>>In another way, whenever we "delete/truncate and then insert" data into >>a table, it is better to "vacuum anaylze"? >> >> >You shouldn't need a VACUUM if you haven't yet done any updates or >deletes since the TRUNCATE. An ANALYZE seems like a good idea, though. >(You could get away without ANALYZE if the new data has essentially the >same statistics as the old, but if you're making only minor changes, why >are you using this technique at all ...) > > After truncate table A, around 60,000 will be inserted. Then a comparision will be done between table A and table B. After that, table B will be updated according to the comparision result. Records inserted into table A is increasing everyday. So, your suggestion is that after the population of table A, the query planner should be able to find the most efficient query plan because we do truncate but not delete, and we do not need to do vacuum analyze at all, right? Thank you, Emi
On fim, 2006-02-16 at 16:24 -0500, Emi Lu wrote: > >>In another way, whenever we "delete/truncate and then insert" data into > >>a table, it is better to "vacuum anaylze"? > >> > >> > >You shouldn't need a VACUUM if you haven't yet done any updates or > >deletes since the TRUNCATE. An ANALYZE seems like a good idea, though. > >(You could get away without ANALYZE if the new data has essentially the > >same statistics as the old, but if you're making only minor changes, why > >are you using this technique at all ...) > > > > > After truncate table A, around 60,000 will be inserted. Then a > comparision will be done between table A and table B. After that, table > B will be updated according to the comparision result. Records inserted > into table A is increasing everyday. > > So, your suggestion is that after the population of table A, the query > planner should be able to find the most efficient query plan because we > do truncate but not delete, and we do not need to do vacuum analyze at > all, right? no. the suggestion was that a VACUUM is not needed, but that an ANALYZE might be. gnari
>>>>In another way, whenever we "delete/truncate and then insert" data into >>>>a table, it is better to "vacuum anaylze"? >>>> >>>> >>>> >>>> >>>You shouldn't need a VACUUM if you haven't yet done any updates or >>>deletes since the TRUNCATE. An ANALYZE seems like a good idea, though. >>>(You could get away without ANALYZE if the new data has essentially the >>>same statistics as the old, but if you're making only minor changes, why >>>are you using this technique at all ...) >>> >>> >>> >>> >>After truncate table A, around 60,000 will be inserted. Then a >>comparision will be done between table A and table B. After that, table >>B will be updated according to the comparision result. Records inserted >>into table A is increasing everyday. >> >>So, your suggestion is that after the population of table A, the query >>planner should be able to find the most efficient query plan because we >>do truncate but not delete, and we do not need to do vacuum analyze at >>all, right? >> >> > >no. the suggestion was that a VACUUM is not needed, but >that an ANALYZE might be. > > Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate thesetwo operations? I guess "vacuum analyze" do both vacuum and analyze. Or "EXPLAIN ANALYZE" can do it for me? Emi
On Fri, 2006-02-17 at 11:06, Emi Lu wrote: > >>>>In another way, whenever we "delete/truncate and then insert" data into > >>>>a table, it is better to "vacuum anaylze"? > >>>> > >>>> > >>>> > >>>> > >>>You shouldn't need a VACUUM if you haven't yet done any updates or > >>>deletes since the TRUNCATE. An ANALYZE seems like a good idea, though. > >>>(You could get away without ANALYZE if the new data has essentially the > >>>same statistics as the old, but if you're making only minor changes, why > >>>are you using this technique at all ...) > >>> > >>> > >>> > >>> > >>After truncate table A, around 60,000 will be inserted. Then a > >>comparision will be done between table A and table B. After that, table > >>B will be updated according to the comparision result. Records inserted > >>into table A is increasing everyday. > >> > >>So, your suggestion is that after the population of table A, the query > >>planner should be able to find the most efficient query plan because we > >>do truncate but not delete, and we do not need to do vacuum analyze at > >>all, right? > >> > >> > > > >no. the suggestion was that a VACUUM is not needed, but > >that an ANALYZE might be. > > > > > > Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate thesetwo operations? I guess "vacuum analyze" do both vacuum and analyze. Or "EXPLAIN ANALYZE" can do it for me? Yeah, vacuum analyze is kind of a leftover from the olden days when you could only run an analyze as part of a vacuum command. analyze has been it's own command for quite some time now.
On fös, 2006-02-17 at 12:06 -0500, Emi Lu wrote: > >>>>In another way, whenever we "delete/truncate and then insert" data into > >>>>a table, it is better to "vacuum anaylze"? > >>>> > ... > >> > >>So, your suggestion is that after the population of table A, the query > >>planner should be able to find the most efficient query plan because we > >>do truncate but not delete, and we do not need to do vacuum analyze at > >>all, right? > >> > >> > ... > > Thank you gnari for your answer. But I am a bit confused about not running vacuum but only "analyze". Can I seperate thesetwo operations? I guess "vacuum analyze" do both vacuum and analyze. > Or "EXPLAIN ANALYZE" can do it for me? VACUUM ensures that dead rows can be reused. Dead rows are created by DELETE and UPDATE. If you have done a significant number of DELETEs or UPDATEs, you might want to VACUUM ANALYZE collect statistical information about your tables. this helps the planner make good plans. After having changed your data significantly, you might want to ANALYZE, for example after lots of INSERTs, UPDATEs or DELETEs TRUNCATE does not create dead rows, so you do not need to VACUUM just because of that, but you still might have to ANALYZE. If you TRUNCATE a table and then repopulate it with similar data as before, you do not have to ANALYZE, as plans based on the old statistics would assumedly be just as good. EXPLAIN dislays the plan that will be chosen for a query, along with some estimated cost information. EXPLAIN ANALYZE actually executes the query, and shows same info as EXPLAIN, and in addition actual cost information Hope this makes it more clear gnari
Quoth emilu@encs.concordia.ca (Emi Lu): >> no. the suggestion was that a VACUUM is not needed, but that an >> ANALYZE might be. > > Thank you gnari for your answer. But I am a bit confused about not > running vacuum but only "analyze". Can I seperate these two > operations? I guess "vacuum analyze" do both vacuum and analyze. Or > "EXPLAIN ANALYZE" can do it for me? EXPLAIN, ANALYZE, and VACUUM are different things; ANALYZE gets used in two different contexts... 1. VACUUM is what cleans dead tuples out of tables. e.g. VACUUM my_table; 2. VACUUM ANALYZE cleans out dead tuples and recalculates data distributions e.g. VACUUM ANALYZE my_table; 3. EXPLAIN describes query plans e.g. EXPLAIN select * from my_table; 4. EXPLAIN ANALYZE compares query plan estimates to real results e.g. EXPLAIN ANALYZE select * from my_table; 5. ANALYZE recalculates data distributions (as in 2, but without cleaning out dead tuples). e.g. ANALYZE my_table; Pointedly, EXPLAIN ANALYZE is entirely distinct from ANALYZE and VACUUM ANALYZE... -- let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;; http://linuxdatabases.info/info/slony.html Rules of the Evil Overlord #218. "I will not pick up a glowing ancient artifact and shout "Its power is now mine!!!" Instead I will grab some tongs, transfer it to a hazardous materials container, and transport it back to my lab for study." <http://www.eviloverlord.com/>
Thank you very much for all your inputs. I believe "analyze" is the one I should use . >Quoth emilu@encs.concordia.ca (Emi Lu): > > >>>no. the suggestion was that a VACUUM is not needed, but that an >>>ANALYZE might be. >>> >>> >>Thank you gnari for your answer. But I am a bit confused about not >>running vacuum but only "analyze". Can I seperate these two >>operations? I guess "vacuum analyze" do both vacuum and analyze. Or >>"EXPLAIN ANALYZE" can do it for me? >> >> > >EXPLAIN, ANALYZE, and VACUUM are different things; ANALYZE gets used >in two different contexts... > >1. VACUUM is what cleans dead tuples out of tables. > > e.g. VACUUM my_table; > >2. VACUUM ANALYZE cleans out dead tuples and recalculates data > distributions > > e.g. VACUUM ANALYZE my_table; > >3. EXPLAIN describes query plans > > e.g. EXPLAIN select * from my_table; > >4. EXPLAIN ANALYZE compares query plan estimates to real results > > e.g. EXPLAIN ANALYZE select * from my_table; > >5. ANALYZE recalculates data distributions (as in 2, but without > cleaning out dead tuples). > > e.g. ANALYZE my_table; > >Pointedly, EXPLAIN ANALYZE is entirely distinct from ANALYZE and >VACUUM ANALYZE... > >