Thread: dead tuples and VACUUM
Hi, everybody! I wanted to ask a few questions about VACUUM, and dead tuples in general Suppose, I have a table with a few million rows, and every now and then (say, monthly) every row in the table is updated. First of all, it is my understanding that this monthly job will double the size of the table (create a copy of every tuple it is updating). Is that right? Now, if I run VACUUM on that table, it is supposed to reclaim those dead tuples.. Will it actually? I remember some discussions about 7.2, where it was mentioned that plain (not FULL) VACUUM doesn't move rows between pages (so, it seems to me, that if every row is updated, plain vacuum is useless). Is it any better in 7.3? Or does it still have to be VACUUM FULL? And finally, if I do *not* run VACUUM, and let it just sit there with the doubled number of tuples, what will happen next month, when every row gets updated again? Will it be able to reuse the dead tuples then, or will it just keep creating the new ones? My problem is that, as I described above, I am afraid that plain VACUUM just won't help, and running VACUUM FULL takes *days* on my database, and we just cannot afford that much downtime (in fact, it is quicker, for some reason to dump and reload the table then to vacuum full it!). I'd love to hear what you guys know about it! Thanks a lot! Dima
On Sat, May 31, 2003 at 14:15:04 -0400, Dima Tkach <dmitry@openratings.com> wrote: > > First of all, it is my understanding that this monthly job will > double the size of the table (create a copy of every tuple it is > updating). Is that right? Yes. > Now, if I run VACUUM on that table, it is supposed to reclaim those dead > tuples.. Will it actually? I remember some discussions about 7.2, where > it was mentioned that plain (not FULL) VACUUM doesn't move rows between > pages (so, it seems to me, that if every row is updated, plain vacuum is > useless). Is it any better in 7.3? Or does it still have to be VACUUM > FULL? You need to make sure FSM is set large enough to plain vacuum to recover all of the tuples. > And finally, if I do *not* run VACUUM, and let it just sit there with > the doubled number of tuples, what will happen next month, when every > row gets updated again? Will it be able to reuse the dead tuples then, > or will it just keep creating the new ones? If you don't run even plain vacuum the deleted tuples won't be reused. If you have run plain vacuum with FSM set high enough the free space will be reused.
On Sat, May 31, 2003 at 02:15:04PM -0400, Dima Tkach wrote: > First of all, it is my understanding that this monthly job will > double the size of the table (create a copy of every tuple it is > updating). Is that right? Sort of. Read on. > Now, if I run VACUUM on that table, it is supposed to reclaim those dead > tuples.. Will it actually? I remember some discussions about 7.2, where > it was mentioned that plain (not FULL) VACUUM doesn't move rows between > pages (so, it seems to me, that if every row is updated, plain vacuum is > useless). Is it any better in 7.3? Or does it still have to be VACUUM > FULL? If you replace every tuple in the table, you probably don't have a large enough fress space map to track all of that. So VACUUM FULL is a good idea. But another answer is to VACUUM every (say) couple thousand UPDATEs. That'll keep the table size managable. You could even run a parallel VACUUM. If the whole table is UPDATEd in one transaction, though, that won't help. > And finally, if I do *not* run VACUUM, and let it just sit there with > the doubled number of tuples, what will happen next month, when every > row gets updated again? Will it be able to reuse the dead tuples then, > or will it just keep creating the new ones? It can't use the free space until you've run VACUUM. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
Sorry for this very basic question. After installing Postgresql 7.3.2 and migrating table description. I can not modify the column types of some tables. To do the modification, I tried both pgphpadmin and pgadmin with the administrator user which comes by default after installation. I am pretty sure this is link to the privilegies but I can not find out the solution. Could you please help ? GH
Thanks for the reply! I'd still like to clariofy some points... >If you replace every tuple in the table, you probably don't have a >large enough fress space map to track all of that. So VACUUM FULL is >a good idea. > > Where do I set that free space map? Is it a configuration parameter? And what exactly does it mean? If it is not high enough and I never run VACUUM FULL, are my dead tuples lost forever? >But another answer is to VACUUM every (say) couple thousand UPDATEs. >That'll keep the table size managable. > I am afraid, that is not feasible too - you see, that monthly update job I mentioned runs continuosly and takes about a couple of weeks to complete as it is. If I pause it every couple thousands of rows to do a vacuum, it will, I am afraid, take months (and I have to stay under at least 1 month, because the whole idea is to update every month). > You could even run a parallel >VACUUM. If the whole table is UPDATEd in one transaction, though, >that won't help. > It is not one transaction. I am running about 10 parallel jobs, and each of them commits every now and then (every 10K rows, I believe). If I run VACUUM in parallel, what exactly is going to happen? Will I still be loosing some of those dead tuples after each update? Basically, my question is - if I let it run this way forever, will the database just keep growing on me until I am out of the disk space? Is there any way to prevent that, without having to take it offline for days? Thanks! Dima
On Mon, Jun 02, 2003 at 10:57:18 -0400, Dmitry Tkach <dmitry@openratings.com> wrote: > Thanks for the reply! > > I'd still like to clariofy some points... > > >If you replace every tuple in the table, you probably don't have a > >large enough fress space map to track all of that. So VACUUM FULL is > >a good idea. > > > > > Where do I set that free space map? Is it a configuration parameter? > And what exactly does it mean? If it is not high enough and I never run > VACUUM FULL, are my dead tuples lost forever? It is a configuration paramter. I believe the name is FSM. > > >But another answer is to VACUUM every (say) couple thousand UPDATEs. > >That'll keep the table size managable. > > > I am afraid, that is not feasible too - you see, that monthly update job > I mentioned runs continuosly and takes about a couple of weeks to > complete as it is. > If I pause it every couple thousands of rows to do a vacuum, it will, I > am afraid, take months (and I have to stay under at least 1 month, > because the whole idea is to update every month). If you are doing normal vacuum (not vacuum full), it doesn't lock tables so you can run it along with other updates without taking a big performance hit. All it does is mark deleted tuples that are not visible to any current transactions as safe to reuse. By running vacuum frequently you can maintain a steady state size for your database. If it grows unusually large and you need to recover some space for other uses then you would want to use vacuum full. > > >You could even run a parallel > >VACUUM. If the whole table is UPDATEd in one transaction, though, > >that won't help. > > > It is not one transaction. I am running about 10 parallel jobs, and each > of them commits every now and then (every 10K rows, I believe). > If I run VACUUM in parallel, what exactly is going to happen? It will work just fine. > > Will I still be loosing some of those dead tuples after each update? Vacuum will mark deleted tuples as reusuable if there are no open transactions that can still see them. > > Basically, my question is - if I let it run this way forever, will the > database just keep growing on me until I am out of the disk space? > Is there any way to prevent that, without having to take it offline for > days? It should reach a steady state size that depends on how often you vacuum and how often you update. If something goes wrong (say a transaction is started intereactively and then left open for a day or two) then the database may grow unusually large and you may need to do a vacuum full. > > Thanks! > > Dima > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org