Re: Dead Space Map - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Dead Space Map
Date
Msg-id Pine.OSF.4.61.0602272114460.130558@kosh.hut.fi
Whole thread Raw
In response to Re: Dead Space Map  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Dead Space Map
List pgsql-hackers
On Mon, 27 Feb 2006, Tom Lane wrote:

> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> Vacuum will need to be modified to use index lookups to find index tuples
>> corresponding the dead heap tuples. Otherwise you have to scan through
>> all the indexes anyway.
>
> This strikes me as a fairly bad idea, because it makes VACUUM dependent
> on correct functioning of user-written code --- consider a functional
> index involving a user-written function that was claimed to be immutable
> and is not.

If the user-defined function is broken, you're in more or less trouble 
anyway. A VACUUM FULL or REINDEX can be used to recover.

> There are concurrency-safety issues too, I think, having to
> do with the way that btree ensures we don't delete any index tuple that
> some scan is stopped on.

Hmm, I see. I'll have to study the btree implementation more thoroughly.

>> * implementation of index-only scans
>
>> An index scan would not have to check the visibility information of heap
>> tuples on those heap pages that are marked as clean in the dead space map.
>> This requires that the dead space map is implemented so that a page is
>> reliably marked as dirty in all circumstances when it contains any tuples
>> that are not visible to all backends.
>
> The "reliably" part of this is likely to make it a non-starter.

AFAICS all heap access goes through the functions in heapam.c. They need 
to be modified to update the dead space map. Also on recovery. The 
locking semantics of the dead space map need to be thought out, but I 
don't see any insurmountable problems.

> Another
> problem is that the semantics needed by this are not quite the same as
> the semantics of whether a page needs to be visited by vacuum.

True. We might have to have two bits per page. It's still not that 
bad, though, compared to the benefit.

>> My current implementation stores a bitmap of 32k bits in the special space
>> of every 32k heap pages. Each bit in the bitmap corresponds one heap page.
>> The bit is set every time a tuple is updated, and it's cleared by vacuum.
>> This is a very simple approach, and doesn't take much space.
>
> I thought the plan was to use out-of-line storage associated with each
> table "segment" file.

You're probably referring to Alvaro's auto-vacuum todo list from July:

http://archives.postgresql.org/pgsql-hackers/2005-07/msg01029.php

I find it more attractive to put the bitmap in the special space, for the 
reasons stated earlier by Jan Wieck:

http://archives.postgresql.org/pgsql-hackers/2004-03/msg00957.php

That is, so that you can utilize the common buffer management code. Jan 
also had a plan there for the locking.

- Heikki


pgsql-hackers by date:

Previous
From: "Mark Woodward"
Date:
Subject: Re: pg_config, pg_service.conf, postgresql.conf ....
Next
From: "Jeffrey W. Baker"
Date:
Subject: Re: Any conclusion on the Xeon context-switching issue?