Re: Dead Space Map - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Dead Space Map
Msg-id 10601.1141106640@sss.pgh.pa.us
In response to Re: Dead Space Map  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: Dead Space Map
List pgsql-hackers
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> On Mon, Feb 27, 2006 at 03:05:41PM -0500, Tom Lane wrote:
>> Moreover, you haven't pointed to any strong reason to adopt this
>> methodology.  It'd only be a win when vacuuming pretty small numbers
>> of tuples, which is not the design center for VACUUM, and isn't likely
>> to be the case in practice either if you're using autovacuum.  If you're
>> removing say 1% of the tuples, you are likely to be hitting every index
>> page to do it, meaning that the scan approach will be significantly
>> *more* efficient than retail lookups.

> The use case is any large table that sees updates in 'hot spots'.
> Anything that's based on current time is a likely candidate, since often
> most activity only concerns the past few days of data.

I'm unmoved by that argument too.  If the updates are clustered then
another effect kicks in: the existing btbulkdelete approach is able to
collapse all the deletions on a given index page into one WAL record.
With retail deletes it'd be difficult if not impossible to do that,
resulting in a significant increase in WAL traffic during a vacuum.
(We know it's significant because we saw a good improvement when we
fixed btbulkdelete to work that way, instead of issuing a separate
WAL record per deleted index entry as it once did.)
        regards, tom lane
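
The WAL argument above can be illustrated with a small counting sketch. It is not PostgreSQL code: the function names and the (page, offset) representation of an index entry are assumptions made for illustration only. It compares one WAL record per deleted index entry (the retail model, and the old btbulkdelete behaviour described above) with one WAL record per index page that holds dead entries (the current btbulkdelete behaviour described above).

```python
def wal_records_retail(dead_entries):
    """Retail model: one WAL record for every deleted index entry."""
    return len(dead_entries)


def wal_records_bulk(dead_entries):
    """Bulk model: one WAL record per index page that has at least one
    dead entry, no matter how many entries are removed from that page."""
    pages = {page for page, _offset in dead_entries}
    return len(pages)


# Hypothetical hot spot: 1,000 dead entries packed onto 5 adjacent leaf pages.
# Each entry is an illustrative (page_number, offset_on_page) pair.
dead = [(page, offset) for page in range(100, 105) for offset in range(200)]

print(wal_records_retail(dead))  # 1000
print(wal_records_bulk(dead))    # 5
```

The more tightly the dead entries are clustered, the larger the gap between the two counts, which is the effect the message describes for hot-spot updates.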

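The quoted claim that removing about 1% of the tuples is likely to touch every index page can be checked with a rough estimate. The sketch below assumes that dead tuples are spread uniformly at random over the index and that a leaf page holds about 300 entries; both are illustrative assumptions, not figures from the thread. Under that model, the chance that a given page holds at least one dead entry is 1 - (1 - p)^n.

```python
def fraction_of_pages_touched(dead_fraction, entries_per_page):
    """Probability that an index page holds at least one dead entry,
    assuming dead entries are distributed uniformly at random."""
    return 1.0 - (1.0 - dead_fraction) ** entries_per_page


# 1% dead tuples, 300 entries per leaf page (assumed value).
print(round(fraction_of_pages_touched(0.01, 300), 3))  # 0.951
```

The uniform-distribution assumption is exactly what the hot-spot use case disputes: when deletions are clustered, far fewer pages are touched, and the reply then rests on the WAL argument instead.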
