Re: Visibility map and hint bits - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: Visibility map and hint bits
Date
Msg-id BANLkTikAaHVVBNfJ82UwWeQfA4jrfy3TAw@mail.gmail.com
In response to Re: Visibility map and hint bits  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Visibility map and hint bits  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Thu, May 5, 2011 at 1:34 PM, Bruce Momjian <bruce@momjian.us> wrote:
> Merlin Moncure wrote:
>> On Thu, May 5, 2011 at 11:59 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> > There has been a lot of recent discussion about the visibility map (for
>> > index-only scans) and hint bits (trying to avoid double-writing a
>> > table).
>>
>> I still think a small tqual.c maintained cache of hint bits will
>> effectively eliminate hint bit i/o issues surrounding bulk loads.  Tom
>> fired a shot across the bow regarding the general worthiness of that
>> technique though (see:
>> http://postgresql.1045698.n5.nabble.com/Process-local-hint-bit-cache-td4270229.html)
>> :(.  I can rig up a cleaned up version of the patch pretty
>> easily...it's a local change and fairly simple.
>>
>> I don't think there is any way to remove the hint bits without
>> suffering some other problem.
>
> Was that the idea that the pages had to fit in the cache and be updated
> with hint bits before being written to disk?  Restricting that to the
> size of the buffer cache seemed very limiting.
>
> One 8k visibility map page can hold bits for 1/2 gig of heap pages so I
> thought that would be a better all-visible indicator and avoid many
> all-visible page writes in bulk load cases.
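
As a rough check on the 1/2 gig figure quoted above, here is a minimal
sketch of the arithmetic. It assumes one visibility bit per heap page and
ignores the page header on the map page, so the real figure is slightly
smaller; the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of heap covered by a single visibility map page.
 * Assumptions: one bit per heap page, map page and heap page share the
 * same block size, and the map page's header overhead is ignored.
 */
static uint64_t heap_bytes_per_vm_page(uint64_t block_size)
{
    assert(block_size > 0);

    uint64_t bits_in_map_page = block_size * 8;   /* one bit per heap page */
    return bits_in_map_page * block_size;         /* heap pages * page size */
}
```

With an 8192-byte block that is 65536 heap pages of 8192 bytes each,
i.e. 512 MB of heap per map page.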

No, that was my first idea: check visibility when you evict.  That
helps a different problem, but not bulk loads.  One way it could help
is in marking PD_ALL_VISIBLE.  That might also be a winner, but there
is some valid skepticism about whether adding more work for bgwriter
is really a good idea.

The tqual cache idea is that there is a small cache that remembers
the commit/cancel status of recently seen transactions.  If you scan
a tuple and its status is known via the cache, you set the bit but
don't mark the page dirty.  That way, if you are scanning a lot of
unhinted tuples with similar xids, you don't need to jam out i/o.  I
think the general concept is clean, but it might need some buy-in from
Tom and some performance testing for justification.
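
To make the idea concrete, here is a minimal standalone sketch.  It is
not the patch and does not use the real tqual.c or transam.c interfaces:
every name in it (hint_cache_lookup, toy_check_xmin, toy_clog_lookup,
the ToyTuple/ToyPage structs, the 256-slot size) is hypothetical, and the
"clog" is a toy function.  It also assumes that a cache miss falls back
to the usual behavior of setting the hint and dirtying the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;          /* stand-in for the real type */

typedef enum { XID_UNKNOWN = 0, XID_COMMITTED, XID_ABORTED } XidStatus;

#define HINT_CACHE_SLOTS 256             /* hypothetical size, power of two */

typedef struct { TransactionId xid; XidStatus status; } HintCacheEntry;
typedef struct { TransactionId xmin; bool hint_committed; bool hint_aborted; } ToyTuple;
typedef struct { bool dirty; } ToyPage;

/* Process-local, direct-mapped cache; zero-initialized means "empty". */
static HintCacheEntry hint_cache[HINT_CACHE_SLOTS];

static void hint_cache_remember(TransactionId xid, XidStatus status)
{
    assert(status != XID_UNKNOWN);       /* only final states are cached */
    HintCacheEntry *e = &hint_cache[xid & (HINT_CACHE_SLOTS - 1)];
    e->xid = xid;
    e->status = status;
}

static XidStatus hint_cache_lookup(TransactionId xid)
{
    const HintCacheEntry *e = &hint_cache[xid & (HINT_CACHE_SLOTS - 1)];
    return (e->status != XID_UNKNOWN && e->xid == xid) ? e->status : XID_UNKNOWN;
}

/* Toy replacement for the expensive status lookup: even xids committed. */
static int toy_slow_lookups = 0;
static XidStatus toy_clog_lookup(TransactionId xid)
{
    toy_slow_lookups++;
    return (xid % 2 == 0) ? XID_COMMITTED : XID_ABORTED;
}

/* Resolve a tuple's xmin status, hinting the tuple as a side effect. */
static XidStatus toy_check_xmin(ToyPage *page, ToyTuple *tup)
{
    if (tup->hint_committed) return XID_COMMITTED;
    if (tup->hint_aborted)   return XID_ABORTED;

    XidStatus st = hint_cache_lookup(tup->xmin);
    if (st == XID_UNKNOWN)
    {
        /* Cache miss: do the slow lookup and dirty the page as usual. */
        st = toy_clog_lookup(tup->xmin);
        hint_cache_remember(tup->xmin, st);
        page->dirty = true;
    }
    /* On a cache hit the bit is set in memory but the page is NOT dirtied. */
    if (st == XID_COMMITTED) tup->hint_committed = true;
    else                     tup->hint_aborted = true;
    return st;
}
```

In this sketch only the first tuple for a given xid pays for the slow
lookup and dirties its page; later tuples with the same xid on other
pages get hinted in memory without generating a write.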

The alternate 'cleaner' approach of maintaining a larger transam.c cache
had some downsides that I saw no simple workaround for.

merlin

