Re: crash-safe visibility map, take four - Mailing list pgsql-hackers

From Robert Haas
Subject Re: crash-safe visibility map, take four
Date
Msg-id AANLkTinn7DB=eTf4OmVZLUwfzufv7PkYbw-8hSCmXYr=@mail.gmail.com
Whole thread Raw
In response to Re: crash-safe visibility map, take four  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: crash-safe visibility map, take four
List pgsql-hackers
On Wed, Mar 30, 2011 at 8:52 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Yeah, that's a straightforward way to fix it. I don't think the performance
> hit will be too bad. But we need to be careful not to hold locks while doing
> I/O, which might require some rearrangement of the code. We might want to do
> a similar dance that we do in vacuum, and call visibilitymap_pin first, then
> lock and update the heap page, and then set the VM bit while holding the
> lock on the heap page.

Looking at heap_insert(), for example, we get the target buffer by
calling RelationGetBufferForTuple(), which kicks out an already
x-locked buffer.  Until we have an x-lock on the buffer, we don't
actually know that there's enough free space for the new tuple.  But
if we release the x-lock to pin the visibility map page, then, one,
we're adding an additional lock acquisition and release cycle, and,
two, by the time we reacquire the x-lock the amount of free space may
no longer be sufficient.  Maybe we could check PD_ALL_VISIBLE before
taking the buffer lock - if it appears to be set, then we pin the
visibility map page before taking the buffer lock.  Otherwise, we take
the buffer lock at once.  Either way, once we have the lock, we
recheck the bit.  Only if it's set and we haven't got a pin do we need
to do the drop-lock-pin-reacquire-lock dance.  Is that at all
sensible?

I also wonder if we should be thinking about holding on to the
visibility map pin longer, rather than potentially reacquiring it for
every tuple inserted.  For bulk loading it doesn't matter; we'll
usually be extending the relation anyway, so the PD_ALL_VISIBLE bits
won't be set and we needn't ever hit the map.  But for a table where
there's a large amount of internal free space, it might matter.  Then
again maybe we don't clear all-visible bits often enough for it to be
an issue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Another swing at JSON
Next
From: Merlin Moncure
Date:
Subject: Re: Process local hint bit cache