Re: limiting hint bit I/O - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: limiting hint bit I/O
Date
Msg-id C45117AC-5AE6-4101-B722-6CE4E159D154@nasby.net
In response to Re: limiting hint bit I/O  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Jan 16, 2011, at 4:37 PM, Kevin Grittner wrote:
> Robert Haas  wrote:
>
>> a quick-and-dirty attempt to limit the amount of I/O caused by hint
>> bits. I'm still very interested in knowing what people think about
>> that.
>
> I found the elimination of the response-time spike promising.  I
> don't think I've seen enough data yet to feel comfortable endorsing
> it, though.  I guess the question in my head is: how much of the
> lingering performance hit was due to having to go to clog and how
> much was due to competition with the deferred writes?  If much of it
> is due to repeated recalculation of visibility based on clog info, I
> think there would need to be some way to limit how many times that
> happened before the hint bits were saved.

What if we sped up the case where hint bits aren't set? Has anyone collected data on the actual pain points of checking
visibility when hint bits aren't set? How about when setting hint bits is intentionally delayed? I wish we had some more
infrastructure around the XIDCACHE counters; having that info available for people's general workloads might be
extremely valuable. Even if I was to compile with it turned on, it seems the only way to get at it is via stderr, which
is very hard to deal with.

Lacking performance data (and for my own education), I've spent the past few hours studying HeapTupleSatisfiesNow(). If
I'm understanding it correctly, the three critical functions from a performance standpoint are
TransactionIdIsCurrentTransactionId, TransactionIdIsInProgress and TransactionIdDidCommit. Note that all 3 can
potentially be called twice; once to check xmin and once to check xmax.

ISTM TransactionIdIsCurrentTransactionId is missing a shortcut: shouldn't we be able to immediately return false if the
XID we're checking is older than some value, like global xmin? Maybe it's only worth checking that case if we hit a
subtransaction, but if the check is faster than one or two loops through the binary search... I would think this at
least warrants a one XID cache ala cachedFetchXidStatus (though it would need to be a different cache...). Another issue
is that TransactionIdIsInProgress will call this function as well, unless it skips out because the transaction is <
RecentXmin.
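
To make the shortcut idea concrete, here is a rough standalone sketch (not a patch against the tree). The names xid_precedes, slow_is_current_xid and is_current_xid_with_shortcut are made up for illustration, and the "horizon" argument stands in for something like global xmin. It assumes the horizon is never newer than any XID owned by the current transaction, and that the own-XID array is sorted, which is what the binary search needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define FirstNormalTransactionId ((TransactionId) 3)
#define TransactionIdIsNormal(xid) ((xid) >= FirstNormalTransactionId)

/* Modulo-2^32 comparison for normal XIDs (same idea as TransactionIdPrecedes). */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    if (!TransactionIdIsNormal(a) || !TransactionIdIsNormal(b))
        return a < b;
    return (int32_t) (a - b) < 0;
}

/* Hypothetical stand-in for the existing binary search over our own XIDs. */
static bool
slow_is_current_xid(TransactionId xid, const TransactionId *own_xids, int n)
{
    int low = 0;
    int high = n - 1;

    while (low <= high)
    {
        int mid = low + (high - low) / 2;

        if (own_xids[mid] == xid)
            return true;
        if (xid_precedes(own_xids[mid], xid))
            low = mid + 1;
        else
            high = mid - 1;
    }
    return false;
}

/*
 * Proposed shortcut: anything older than the horizon cannot belong to the
 * current transaction, so skip the search entirely.
 * ASSUMPTION: horizon is not newer than any entry in own_xids.
 */
static bool
is_current_xid_with_shortcut(TransactionId xid, TransactionId horizon,
                             const TransactionId *own_xids, int n)
{
    assert(n >= 0);

    if (TransactionIdIsNormal(xid) && xid_precedes(xid, horizon))
        return false;           /* cheap exit: one comparison */

    return slow_is_current_xid(xid, own_xids, n);
}
```

Whether the extra comparison pays for itself depends on how often the search would otherwise run, which is exactly the data we don't have.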

TransactionIdIsInProgress does a fair amount of easy checking already... the biggest thing is that if it's less than
RecentXmin we bounce out immediately. If we can't bounce out immediately though, this routine gets pretty expensive
unless the XID is currently running and is top-level. It's worse if there are subxacts and can be horribly bad if any
subxact caches have overflowed. Note that if anything has overflowed, then we end up going to clog and possibly
pg_subtrans.

Finally, TransactionIdDidCommit hits clog.

So the degenerate cases seem to be:

- Really old XIDs. These suck because there's a good chance we'll have to read from clog.
- XIDs > RecentXmin that are not currently running top-level transactions. The pain here increases with subtransaction
use.

For the second case, if we can ensure that RecentXmin is not very old then there's generally a smaller chance that
TransactionIdIsInProgress has to do a lot of work. My experience is that most systems that have a high transaction rate
don't end up with a lot of long-running transactions. Storing a list of the X oldest transactions would allow us to keep
RecentXmin closer to the most recent XID.

For the first case, we should be able to create a more optimized clog lookup method that works for older XIDs. If we
restrict this to XIDs that are older than GlobalXmin then we can simplify things because we don't have to worry about
transactions that are in-progress. We also don't need to differentiate between subtransactions and their parents
(though, we obviously need to figure out whether a subtransaction is considered to be committed or not). Because we're
restricting this to XIDs that we know we can determine the state of, we only need to store a maximum of 1 bit per XID.
That's already half the size of clog. But because we don't have to build this list on the fly (we don't need to
update it on every commit/abort as long as we know the range of XIDs that are stored), we don't have to support random
writes. That means we can use a structure that's more complex to maintain than a simple bitmap. Or maybe we stick with a
bitmap but compress it.
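
As a strawman for the uncompressed variant, something like the following would do. This is only a sketch under assumptions: CommitBitmap and the commit_bitmap_* functions are invented names, the structure covers a fixed range of XIDs that are all older than GlobalXmin, and it is built once in bulk rather than updated on commit/abort. One bit per XID means 1,000,000 XIDs take 125,000 bytes, versus 250,000 bytes at clog's 2 bits per XID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Hypothetical read-mostly map: 1 = committed, 0 = aborted. */
typedef struct CommitBitmap
{
    TransactionId base;         /* first XID covered */
    uint32_t      nxids;        /* number of consecutive XIDs covered */
    uint8_t      *bits;         /* one bit per XID */
} CommitBitmap;

static CommitBitmap *
commit_bitmap_create(TransactionId base, uint32_t nxids)
{
    size_t        nbytes = ((size_t) nxids + 7) / 8;
    CommitBitmap *bm = malloc(sizeof(*bm));

    if (bm == NULL)
        return NULL;
    bm->bits = calloc(nbytes > 0 ? nbytes : 1, 1);  /* all "aborted" */
    if (bm->bits == NULL)
    {
        free(bm);
        return NULL;
    }
    bm->base = base;
    bm->nxids = nxids;
    return bm;
}

static void
commit_bitmap_free(CommitBitmap *bm)
{
    free(bm->bits);
    free(bm);
}

/* Unsigned subtraction keeps the range check valid across XID wraparound. */
static bool
commit_bitmap_covers(const CommitBitmap *bm, TransactionId xid)
{
    return (uint32_t) (xid - bm->base) < bm->nxids;
}

/* Only used while the map is being built, never on the commit path. */
static void
commit_bitmap_set_committed(CommitBitmap *bm, TransactionId xid)
{
    uint32_t off = xid - bm->base;

    assert(commit_bitmap_covers(bm, xid));
    bm->bits[off / 8] |= (uint8_t) (1u << (off % 8));
}

/* Lookup: no in-progress state and no subtransaction parent chasing. */
static bool
commit_bitmap_did_commit(const CommitBitmap *bm, TransactionId xid)
{
    uint32_t off = xid - bm->base;

    assert(commit_bitmap_covers(bm, xid));
    return (bm->bits[off / 8] >> (off % 8)) & 1u;
}
```

A caller would check commit_bitmap_covers() first and fall back to the normal clog path for anything outside the range. A compressed or more elaborate structure could sit behind the same three-call interface.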
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



