Re: [PERFORM] Realtime VACUUM, was: performance of insert/delete/update - Mailing list pgsql-hackers
From | Tommi Maekitalo |
---|---|
Subject | Re: [PERFORM] Realtime VACUUM, was: performance of insert/delete/update |
Date | |
Msg-id | 200211271634.04404.t.maekitalo@epgmbh.de Whole thread Raw |
In response to | Re: [PERFORM] Realtime VACUUM, was: performance of insert/delete/update ("Nicolai Tufar" <ntufar@apb.com.tr>) |
List | pgsql-hackers |
Or just reorg. Am Mittwoch, 27. November 2002 15:02 schrieb Nicolai Tufar: > I always wandered if VACUUM is the right name for the porcess. Now, when > PostgreSQL > is actively challenging in Enterprise space, it might be a good idea to > give it a more > enterprise-like name. Try to think how it is looking for an outside person > to see > us, database professionals hold lenghty discussions about the ways we > vacuum a database. Why should you need to vacuum a database? Is it > dirty? In my personal opinion, something like "space reclaiming daemon", > "free-list organizer", "tuple recyle job" or "segment coalesce process" > would > sound more business-like . > > Regards, > Nick > > > ----- Original Message ----- > From: "Bruce Momjian" <pgman@candle.pha.pa.us> > To: "Curtis Faith" <curtis@galtair.com> > Cc: "Tom Lane" <tgl@sss.pgh.pa.us>; "Ron Johnson" <ron.l.johnson@cox.net>; > "PgSQL Performance ML" <pgsql-performance@postgresql.org>; > <pgsql-hackers@postgresql.org> > Sent: Tuesday, November 26, 2002 9:09 PM > Subject: Re: [PERFORM] [HACKERS] Realtime VACUUM, was: performance of > insert/delete/update > > > Good ideas. I think the master solution is to hook the statistics > > daemon information into an automatic vacuum that could _know_ which > > tables need attention. > > > > ------------------------------------------------------------------------- > >- > > - > > > Curtis Faith wrote: > > > tom lane wrote: > > > > Sure, it's just shuffling the housekeeping work from one place to > > > > another. The thing that I like about Postgres' approach is that we > > > > put the housekeeping in a background task (VACUUM) rather than in the > > > > critical path of foreground transaction commit. > > > > > > Thinking with my marketing hat on, MVCC would be a much bigger win if > > VACUUM > > > > was not required (or was done automagically). The need for periodic > > VACUUM > > > > just gives ammunition to the PostgreSQL opponents who can claim we are > > > deferring work but that it amounts to the same thing. > > > > > > A fully automatic background VACUUM will significantly reduce but will > > not > > > > eliminate this perceived weakness. > > > > > > However, it always seemed to me there should be some way to reuse the > > space > > > > more dynamically and quickly than a background VACUUM thereby reducing > > the > > > > percentage of tuples that are expired in heavy update cases. If only a > > very > > > > tiny number of tuples on the disk are expired this will reduce the > > aggregate > > > > performance/space penalty of MVCC into insignificance for the majority > > of > > > > uses. > > > > > > Couldn't we reuse tuple and index space as soon as there are no > > transactions > > > > that depend on the old tuple or index values. I have imagined that this > > was > > > > always part of the long-term master plan. > > > > > > Couldn't we keep a list of dead tuples in shared memory and look in the > > list > > > > first when deciding where to place new values for inserts or updates so > > we > > > > don't have to rely on VACUUM (even a background one)? If there are > > expired > > > > tuple slots in the list these would be used before allocating a new > > > slot > > from > > > > the tuple heap. > > > > > > The only issue is determining the lowest transaction ID for in-process > > > transactions which seems relatively easy to do (if it's not already > > > done somewhere). > > > > > > In the normal shutdown and startup case, a tuple VACUUM could be > > performed > > > > automatically. This would normally be very fast since there would not > > > be > > many > > > > tuples in the list. > > > > > > Index slots would be handled differently since these cannot be > > substituted > > > > one for another. However, these could be recovered as part of every > > index > > > > page update. Pages would be scanned before being written and any > > > expired slots that had transaction ID's lower than the lowest active > > > slot would > > be > > > > removed. This could be done for non-leaf pages as well and would result > > in > > > > only reorganizing a page that is already going to be written thereby > > > not adding much to the overall work. > > > > > > I don't think that internal pages that contain pointers to values in > > nodes > > > > further down the tree that are no longer in the leaf nodes because of > > this > > > > partial expired entry elimination will cause a problem since searches > > and > > > > scans will still work fine. > > > > > > Does VACUUM do something that could not be handled in this realtime > > manner? > > > > - Curtis > > > > > > > > > > > > ---------------------------(end of > > > broadcast)--------------------------- TIP 4: Don't 'kill -9' the > > > postmaster > > > > -- > > Bruce Momjian | http://candle.pha.pa.us > > pgman@candle.pha.pa.us | (610) 359-1001 > > + If your life is a hard drive, | 13 Roberts Road > > + Christ can be your backup. | Newtown Square, Pennsylvania > > 19073 > > > ---------------------------(end of broadcast)--------------------------- > > TIP 2: you can get off all lists at once with the unregister command > > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster -- Dr. Eckhardt + Partner GmbH http://www.epgmbh.de
pgsql-hackers by date: