From Merlin Moncure
Subject Re: limiting hint bit I/O
Msg-id AANLkTimwi1JzV4FvOYixj4jsVWp_qUYk7a6_y+44girV@mail.gmail.com
In response to Re: limiting hint bit I/O  (Jim Nasby <jim@nasby.net>)
Responses Re: limiting hint bit I/O  (Jim Nasby <jim@nasby.net>)
Re: limiting hint bit I/O  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Jan 18, 2011 at 3:47 AM, Jim Nasby <jim@nasby.net> wrote:
> On Jan 16, 2011, at 4:37 PM, Kevin Grittner wrote:
>> Robert Haas  wrote:
>>
>>> a quick-and-dirty attempt to limit the amount of I/O caused by hint
>>> bits. I'm still very interested in knowing what people think about
>>> that.
>>
>> I found the elimination of the response-time spike promising.  I
>> don't think I've seen enough data yet to feel comfortable endorsing
>> it, though.  I guess the question in my head is: how much of the
>> lingering performance hit was due to having to go to clog and how
>> much was due to competition with the deferred writes?  If much of it
>> is due to repeated recalculation of visibility based on clog info, I
>> think there would need to be some way to limit how many times that
>> happened before the hint bits were saved.
>
> What if we sped up the case where hint bits aren't set? Has anyone
> collected data on the actual pain points of checking visibility when
> hint bits aren't set? How about when setting hint bits is
> intentionally delayed? I wish we had some more infrastructure around
> the XIDCACHE counters; having that info available for people's general
> workloads might be extremely valuable. Even if I was to compile with
> it turned on, it seems the only way to get at it is via stderr, which
> is very hard to deal with.
>
> Lacking performance data (and for my own education), I've spent the
> past few hours studying HeapTupleSatisfiesNow(). If I'm understanding
> it correctly, the three critical functions from a performance
> standpoint are TransactionIdIsCurrentTransactionId,
> TransactionIdIsInProgress and TransactionIdDidCommit. Note that all 3
> can potentially be called twice; once to check xmin and once to check
> xmax.

hint bits give you two benefits: you don't have to lwlock the clog and
you don't have to go look the status up.  a lookup is either an lru
cache lookup or an i/o lookup on the clog.  the cost of course is the
extra writing out of the bits.  in most workloads those writes are not
even noticed, but in particular cases they are an i/o multiplier.
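
to make that trade-off concrete, here is a minimal toy model in Python. it is a sketch under stated assumptions, not PostgreSQL's code: the names `HeapTuple`, `Clog`, `Page`, `xmin_committed` and `scan` are all hypothetical, only the xmin side is modeled, and every transaction is assumed to have already finished (no in-progress case). each clog lookup stands in for the lock plus the cache or disk access; the `dirty` flag stands in for the extra write.

```python
from dataclasses import dataclass

# Toy model only: all names here are hypothetical and do not mirror
# PostgreSQL's source. Every xid is assumed to be finished.
XMIN_COMMITTED = 0x1  # hint: inserting transaction is known to have committed
XMIN_ABORTED = 0x2    # hint: inserting transaction is known to have aborted


@dataclass
class HeapTuple:
    xmin: int
    hints: int = 0


class Clog:
    """Stand-in for the commit log: xid -> True (committed) / False (aborted)."""

    def __init__(self, status):
        self.status = status
        self.lookups = 0  # each lookup models a lock plus a cache or disk access

    def did_commit(self, xid):
        self.lookups += 1
        return self.status[xid]


class Page:
    def __init__(self, tuples):
        self.tuples = tuples
        self.dirty = False  # a dirty page must eventually be written back


def xmin_committed(page, tup, clog, set_hints=True):
    # Fast path: the hint bits answer the question without touching the clog.
    if tup.hints & XMIN_COMMITTED:
        return True
    if tup.hints & XMIN_ABORTED:
        return False
    # Slow path: consult the clog.
    committed = clog.did_commit(tup.xmin)
    if set_hints:
        tup.hints |= XMIN_COMMITTED if committed else XMIN_ABORTED
        page.dirty = True  # the extra write that setting the hint costs
    return committed


def scan(page, clog, set_hints=True):
    """Check every tuple on the page once and return the results."""
    return [xmin_committed(page, t, clog, set_hints) for t in page.tuples]
```

with `set_hints=True` the first scan pays one clog lookup per tuple and dirties the page, and later scans pay nothing. with `set_hints=False` the page stays clean but every scan goes back to the clog, which is roughly the behavior the experiment described next was probing.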

a few weeks back I hacked an experimental patch that removed the hint
bit action completely.  the results were very premature and/or
incorrect, but my initial findings suggested that hint bits might not
be worth the cost from a performance standpoint.  i'd like to see some
more investigation in this direction before going with a complex
application mechanism (although that would be beneficial vs the status
quo).

an ideal testing environment to compare would be a mature database
(full clog) with some verifiable performance tests and a mixed
olap/oltp workload.

merlin