Re: removing PD_ALL_VISIBLE - Mailing list pgsql-hackers

From Robert Haas
Subject Re: removing PD_ALL_VISIBLE
Date
Msg-id CA+TgmoYd4UQhq0Vopu-7X2JB7TcsSX1z0=OJDUVRikWabaST1A@mail.gmail.com
Responses Re: removing PD_ALL_VISIBLE  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Wed, May 29, 2013 at 1:11 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Tue, 2013-05-28 at 19:51 -0400, Robert Haas wrote:
>> > If we just wanted to reduce read cost, why not just take a simpler
>> > approach and give the visibility map a "isfrozen" bit?  Then we'd know
>> > which pages didn't need rescanning without nearly as much complexity.
>>
>> That would break pg_upgrade, which would have to remove visibility map
>> forks when upgrading.  More importantly, it would require another
>> round of complex changes to the write-ahead logging in this area.
>> It's not obvious to me that we'd end up ahead of where we are today,
>> although perhaps I am a pessimist.
>
> If we removed PD_ALL_VISIBLE, then this would be very simple, right? We
> would just follow normal logging rules for setting the visible or frozen
> bit.

I don't see how that makes Josh's proposal any simpler.  His proposal
was to change, in a backward-incompatible fashion, the contents of the
visibility map.  Getting rid of PD_ALL_VISIBLE will not eliminate that
backward-incompatibility.  Neither will it eliminate the need to keep
the visibility/freeze map in sync with the heap itself.  Whether we
get rid of PD_ALL_VISIBLE or not, we'll still have to go look at every
type of WAL record that clears the visibility map bit and make it
clear both the visibility and freeze bits.  We'll still need a WAL
record to set the visibility map bit, just as we do today, and we'll
also need a new WAL record type (or a change to the existing WAL
record type) to set the all-frozen bit, when applicable.
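
To illustrate the point about clearing, here is a toy sketch of a combined
visibility/freeze map. It is not PostgreSQL code: the two-bits-per-page
layout, the ToyMap type, and every toy_* name are assumptions made up for
this example. What it shows is that any code path that clears a page's
entry has to drop both bits together, which is why each WAL record type
that clears the visibility bit today would need to be revisited.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: two bits per heap page, four pages per map byte. */
#define TOY_ALL_VISIBLE    0x01
#define TOY_ALL_FROZEN     0x02
#define TOY_PAGE_MASK      0x03
#define TOY_BITS_PER_PAGE  2
#define TOY_PAGES_PER_BYTE 4
#define TOY_MAP_BYTES      64

typedef struct ToyMap
{
    uint8_t bytes[TOY_MAP_BYTES];
} ToyMap;

static void
toy_map_init(ToyMap *map)
{
    memset(map->bytes, 0, sizeof(map->bytes));
}

/* Bit offset of a page's two-bit entry within its map byte. */
static unsigned
toy_shift(uint32_t page)
{
    return (page % TOY_PAGES_PER_BYTE) * TOY_BITS_PER_PAGE;
}

/* Set one or both flags for a page; existing flags are kept. */
static void
toy_map_set(ToyMap *map, uint32_t page, uint8_t flags)
{
    assert(page / TOY_PAGES_PER_BYTE < TOY_MAP_BYTES);
    assert((flags & ~TOY_PAGE_MASK) == 0);
    map->bytes[page / TOY_PAGES_PER_BYTE] |= (uint8_t) (flags << toy_shift(page));
}

/* Return the flags currently recorded for a page. */
static uint8_t
toy_map_get(const ToyMap *map, uint32_t page)
{
    assert(page / TOY_PAGES_PER_BYTE < TOY_MAP_BYTES);
    return (uint8_t) ((map->bytes[page / TOY_PAGES_PER_BYTE] >> toy_shift(page))
                      & TOY_PAGE_MASK);
}

/*
 * Clear a page's entry.  A page that is no longer all-visible cannot be
 * all-frozen either, so both bits are dropped in one step.
 */
static void
toy_map_clear(ToyMap *map, uint32_t page)
{
    assert(page / TOY_PAGES_PER_BYTE < TOY_MAP_BYTES);
    map->bytes[page / TOY_PAGES_PER_BYTE] &= (uint8_t) ~(TOY_PAGE_MASK << toy_shift(page));
}
```

In this sketch, setting the two bits can happen at different times (which
is where the second WAL record, or the extended existing one, would come
in), but clearing is always a single operation on both.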

Now, independently of Josh's proposal, we could change XLOG_HEAP_VISIBLE
to emit FPIs for every heap page it touches.  For pages that have been
hit by updates or deletes, this would be pretty much free, in 9.3,
since the PD_ALL_VISIBLE bit will probably be set at the same time
we're setting dead line pointers to unused, which is a WAL-logged
operation anyway.  However, for pages that have been hit only by
inserts, this would emit many extra FPIs.

Again independently of Josh's proposal, we could eliminate
PD_ALL_VISIBLE.  This would require either surrendering the
optimization whereby sequential scans can skip visibility checks on
individual tuples within the page, or referring to the visibility map
to get the bit that way.  I know you tested this and couldn't measure
an impact, but with all respect I find that result hard to accept.
Contention around buffer locks and pins is very real; why should it
matter on other workloads and not matter on this one?  It would also
require page modifications prior to consistency to clear the
visibility map bit unconditionally, instead of only when
PD_ALL_VISIBLE is set on the page (though I think it'd be OK to pay
that price if it ended there).
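
For readers unfamiliar with the optimization in question, the following is
a minimal sketch of the idea, not the actual heap scan code. ToyPage,
toy_count_visible, and the per-tuple boolean array are hypothetical
stand-ins: a real scan tests each tuple against a snapshot, and the
all_visible field plays the role of PD_ALL_VISIBLE. The counter makes
visible how many per-tuple checks the page-level flag saves.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TUPLES 8

/* Hypothetical stand-in for a heap page. */
typedef struct ToyPage
{
    bool all_visible;                    /* plays the role of PD_ALL_VISIBLE */
    int  ntuples;
    bool tuple_visible[TOY_MAX_TUPLES];  /* stand-in for a snapshot check */
} ToyPage;

/*
 * Count the tuples on a page that the scan may return.  When the page-level
 * flag is set, every tuple is accepted without an individual check;
 * otherwise each tuple is tested and *checks is incremented once per test.
 */
static int
toy_count_visible(const ToyPage *page, int *checks)
{
    int visible = 0;

    assert(page->ntuples >= 0 && page->ntuples <= TOY_MAX_TUPLES);

    for (int i = 0; i < page->ntuples; i++)
    {
        if (page->all_visible)
        {
            visible++;          /* flag set: skip the per-tuple check */
            continue;
        }

        (*checks)++;            /* flag clear: pay for an individual check */
        if (page->tuple_visible[i])
            visible++;
    }

    return visible;
}
```

Without the flag on the page itself, the all_visible test in this sketch
would have to be answered by a lookup in the visibility map, which is where
the concern about buffer pin and lock contention comes from.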

AFAICS, the main benefit of eliminating PD_ALL_VISIBLE is that we
eliminate one write cycle; that is, we won't dirty the page once to
hint it and then again to mark it all-visible.  But as of 9.3, that
should really only be a problem in the insert-only case.  And in that
case, my proposal to consider all-visible pages as frozen would be a
huge win, because you'd only need to emit XLOG_HEAP_VISIBLE for every
page in the heap, rather than XLOG_HEAP_FREEZE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company