Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: vacuum, performance, and MVCC
Msg-id 1151331354.3885.18.camel@localhost.localdomain
In response to Re: vacuum, performance, and MVCC  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: vacuum, performance, and MVCC
List pgsql-hackers
One fine day, Mon, 2006-06-26 at 14:56, Martijn van Oosterhout wrote:
> On Mon, Jun 26, 2006 at 07:17:31AM -0400, Bruce Momjian wrote:
> > Correct!  We use the same pointers used by normal UPDATEs, except we set
> > a bit on the old tuple indicating it is a single-index tuple, and we
> > don't create index entries for the new tuple.  Index scan routines will
> > need to be taught about the new chains, but because only one tuple in
> > the chain is visible to a single backend, the callers should not need to
> > be modified.
> 
> I suppose we would also change the index_getmulti() function to return
> a set of ctids plus flags so the caller knows to follow the chains,
> right? 

It is probably better to always return the pointer to the head of the
CITC chain (the one an index points to) and do the extra visibility
checks and chain-following on each access. This would keep the change
internal to the tuple fetching functions.
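
To make the idea concrete, here is a toy sketch of chain-following hidden
inside a fetch function. It is not PostgreSQL code: HeapTuple, is_visible
and fetch_via_chain_head are hypothetical names, a page is modelled as a
plain dict of offset -> tuple, and the visibility rule is reduced to a
single snapshot xid. The caller only ever holds the offset of the chain
head.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HeapTuple:
    data: str
    xmin: int                          # xid that created this version
    xmax: Optional[int] = None         # xid that superseded it, if any
    next_offset: Optional[int] = None  # next version on the same page (toy CITC link)


def is_visible(tup: HeapTuple, snapshot_xid: int) -> bool:
    """Toy visibility rule (an assumption, far simpler than real MVCC):
    created at or before the snapshot and not superseded by then."""
    created = tup.xmin <= snapshot_xid
    superseded = tup.xmax is not None and tup.xmax <= snapshot_xid
    return created and not superseded


def fetch_via_chain_head(page: Dict[int, HeapTuple],
                         head_offset: int,
                         snapshot_xid: int) -> Optional[HeapTuple]:
    """Start at the chain head (the slot the index points to) and walk the
    chain until a version visible to this snapshot is found."""
    offset: Optional[int] = head_offset
    seen = set()  # guards against a corrupted, cyclic chain
    while offset is not None and offset not in seen:
        seen.add(offset)
        tup = page.get(offset)
        if tup is None:
            break
        if is_visible(tup, snapshot_xid):
            return tup
        offset = tup.next_offset
    return None
```

Since at most one version in a chain is visible to a given snapshot, the
caller gets back a single tuple or nothing, just as with an ordinary fetch.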

> And for bitmap index scans you would only remember the page in
> the case of such a tuple, since you can't be sure the exact ctid you've
> got is the one you want.

No, outside the tuple access functions you should only ever use the
pointer to the CITC head. That pointer to the CITC head is what is
always passed to those access functions/macros.

VACUUM would run its passes thus:

pass 1: run over the heap, collecting pointers to single dead tuples and
to fully dead CITC chains (fully dead = no live tuples on this page).
Clean up old tuples from CITC chains and move live tuples around so that
the CITC head points to the oldest possibly visible (not vacuumed) tuple.
Doing this in pass 1 frees us from the need to collect a separate set of
pointers for those. Or have you planned that old tuples from CITC chains
are collected on the go, as needed? Of course we could do both.

pass 2: clean indexes based on the ctids from pass 1

pass 3: clean heap based on the ctids from pass 1

If you do it this way, you don't need to invent new data structures to
pass extra info about CITC internals to passes 2 and 3.
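
The three passes above can be sketched on a toy model. This is only an
illustration under stated assumptions, not PostgreSQL code: Tup and the
pass1/pass2/pass3 functions are hypothetical names, a heap is a dict of
page number -> {offset: tuple}, an index is a list of (key, ctid) pairs,
and "dead" is a precomputed flag. It is also an assumption here that
pass 3 removes a fully dead chain by walking it from its head.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Ctid = Tuple[int, int]  # (page number, offset)


@dataclass
class Tup:
    data: str
    dead: bool
    next_offset: Optional[int] = None  # next version in the toy CITC chain
    is_chain_member: bool = False      # True for versions with no index entry


def chain_offsets(page: Dict[int, Tup], head: int) -> List[int]:
    """Offsets of a chain, oldest first, starting at its head."""
    out: List[int] = []
    off: Optional[int] = head
    while off is not None and off in page and off not in out:
        out.append(off)
        off = page[off].next_offset
    return out


def pass1_scan_heap(heap: Dict[int, Dict[int, Tup]]) -> Set[Ctid]:
    """Collect ctids of dead single tuples and fully dead chain heads;
    prune partly dead chains in place so the head slot holds the oldest
    surviving version."""
    dead_ctids: Set[Ctid] = set()
    for page_no, page in heap.items():
        heads = [off for off, t in page.items() if not t.is_chain_member]
        for head in heads:
            chain = chain_offsets(page, head)
            live = [off for off in chain if not page[off].dead]
            if not live:
                dead_ctids.add((page_no, head))  # index entry must go first
                continue
            for off in chain:
                if page[off].dead:
                    del page[off]
            for cur, nxt in zip(live, live[1:]):  # relink survivors
                page[cur].next_offset = nxt
            page[live[-1]].next_offset = None
            oldest = live[0]
            if oldest != head:
                moved = page.pop(oldest)
                moved.is_chain_member = False
                page[head] = moved  # index still points at this slot
    return dead_ctids


def pass2_clean_indexes(index: List[Tuple[str, Ctid]],
                        dead_ctids: Set[Ctid]) -> List[Tuple[str, Ctid]]:
    """Drop index entries whose ctid was collected in pass 1."""
    return [(key, ctid) for key, ctid in index if ctid not in dead_ctids]


def pass3_clean_heap(heap: Dict[int, Dict[int, Tup]],
                     dead_ctids: Set[Ctid]) -> None:
    """Remove the collected tuples, including the rest of a dead chain."""
    for page_no, head in dead_ctids:
        page = heap[page_no]
        for off in chain_offsets(page, head):
            del page[off]
```

Note that passes 2 and 3 see nothing but a set of ordinary ctids; all
knowledge of chains stays in pass 1 and in the chain walk.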

One more thing - when should the free space map be notified about free
space in pages with CITC chains?

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com


