Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers
From | Hannu Krosing |
---|---|
Subject | Re: vacuum, performance, and MVCC |
Date | |
Msg-id | 1151333933.3885.31.camel@localhost.localdomain Whole thread Raw |
In response to | Re: vacuum, performance, and MVCC (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: vacuum, performance, and MVCC
|
List | pgsql-hackers |
Ühel kenal päeval, E, 2006-06-26 kell 10:50, kirjutas Bruce Momjian: > Hannu Krosing wrote: > > ?hel kenal p?eval, E, 2006-06-26 kell 14:56, kirjutas Martijn van > > Oosterhout: > > > On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote: > > > > Correct! We use the same pointers used by normal UPDATEs, except we set > > > > a bit on the old tuple indicating it is a single-index tuple, and we > > > > don't create index entries for the new tuple. Index scan routines will > > > > need to be taught about the new chains, but because only one tuple in > > > > the chain is visible to a single backend, the callers should not need to > > > > be modified. > > > > > > I suppose we would also change the index_getmulti() function to return > > > a set of ctids plus flags so the caller knows to follow the chains, > > > right? > > > > It is probably better to always return the pointer to the head of CITC > > chain (the one an index points to) and do extra visibility checks and > > chain-following on each access. This would keep the change internal to > > tuple fetching functions. > > So index_getnext() traverses the chain and returns one member per call. > Makes sense. Just realize you are in a single index entry returning > multiple tuples. We will need some record keeping to track that. Maybe we need to push visibility checks further down, so that index_getnext() returns only the one heap row that is visible. > > > And for bitmap index scans you would only remember the page in > > > the case of such a tuple, since you can't be sure the exact ctid you've > > > got is the one you want. > > > > no, you should only use the pointer to CITC head outside tuple access > > funtions. And this pointer to CITC head is what is always passed to > > those access functions/macros. > > > > The VACUUM would run its passes thus: > > > > pass 1: run over heap, collect pointers to single dead tuples, and fully > > dead CITC chains (fully dead = no live tuples on this page). Clean up > > old tuples from CITC chains and move live tuples around so that CITC > > points to oldest possibly visible (not vacuumed) tuple. Doing this there > > frees us from need to collect a separate set of pointers for those. Or > > have you planned that old tuples from CITC chains are collected on the > > go/as needed ? Of course we could do both. > > Non-visible CITC members should be freed during an UPDATE on the same > page, so vacuum doesn't have to be involved. Ok. > > pass 2: clean indexes based on ctid from pass 1 > > > > pass 3: clean heap based on ctid from pass 1 > > > > If yo do it this way, you dont need to invent new data structures to > > pass extra info about CITC internals to passes 2 and 3 > > > > On more thing - when should free space map be notified about free space > > in pages with CITC chains ? > > Uh, well, I am thinking we only free CITC space when we are going to use > it for an UPDATE, rather than free things while doing an operation. It > is good to keep the cleanup overhead out of the main path as much as > possible. So vacuum should only remove dead CITC chains and leave the ones with live tuples to CITC internal use ? That would also suggest that pages having live CITC chains and less than N% of free space should mot be reported to FSM. > Also, seems I can't spell algorithms very well: > > Definition: Single-Index-Tuple Chain (SITC) > - > Thinking of vacuum, right now it does these cleanups: > > o non-visible UPDATEs on the same page with no key changes > o non-visible UPDATEs on the same page with key changes > o non-visible UPDATEs on different pages > o DELETEs > o aborted transactions > > The big question is what percentage of dead space is the first one? My > guess is 65%. Can be from 0% to 99.9%, very much dependent on application. -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com NOTICE: This communication contains privileged or other confidential information. If you have received it in error, please advise the sender by reply email and immediately delete the message and any attachments without copying or disclosing the contents.
pgsql-hackers by date: