Re: Process local hint bit cache - Mailing list pgsql-hackers
From | Merlin Moncure |
---|---|
Subject | Re: Process local hint bit cache |
Date | |
Msg-id | BANLkTikLvwFMmFPvmDmX3OONJhTP+deKog@mail.gmail.com Whole thread Raw |
In response to | Re: Process local hint bit cache (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: Process local hint bit cache
|
List | pgsql-hackers |
On Wed, Jun 29, 2011 at 9:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Sat, Apr 2, 2011 at 8:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Merlin Moncure <mmoncure@gmail.com> writes: > >>> aside: >>> Moving TransactionIdInProgress below TransactionIdDidCommit can help >>> in once sense: TransactionIdDidCommit grabs the XidStatus but discards >>> the knowledge if the transaction is known aborted. >> >> Doesn't the single-entry cache help with that? > > Yes, it does. correct. that line of analysis was refuted both in terms of benefit and on structural grounds by Tom. > I'm a little worried by some of the statements around this patch. It > would be good to get a single clear analysis, then build the patch > from that. > > * TransactionIdDidCommit grabs XidStatus and then keeps it - if the > xact is known aborted or known committed. > * In the top of the patch you mention that "It's tempting to try and > expand the simple last transaction status in > transam.c but this check happens too late in the critical visibility > routines to provide much benefit. For the fastest possible cache > performance with the least amount of impact on scan heavy cases, you have > to maintain the cache here if you want to avoid dirtying the page > based on interaction with the cache." > > We call TransactionIdIsKnownCompleted() before we touch shared memory > in TransactionIdIsInProgress(), which seems early enough for me. > Maybe I've misunderstood your comments. This is correct. If you are scanning tuples all with the same transaction, the transam.c cache will prevent some of the worst problems. However, going through all the mechanics in the transam.c covered case is still fairly expensive. You have to call several non-inlined functions, including XLogNeedsFlush(), over and over. This is more expensive than it looks and a transam.c cache implementation is hamstrung out of the gate if you are trying avoid i/o...it isn't really all that much cheaper than jumping out to the slru and will penalize repetitive scans (I can prove this with benchmarks). Bear in mind I started out with the intent of working in transam.c, and dropped it when it turned out not to work. > Also, the reason we set the hint bits is (in my understanding) so that > other users will avoid having to do clog lookups. If we only cache xid > status locally, we would actually generate more clog lookups, not > less. I realise that reduces I/O, so it seems to be a straight > tradeoff between I/O and clog access. That is not correct. Any cache 'miss' on a page continues to fall through to SetHintBitsCache() which dirties the page as it always has.This is a key innovation the patch IMNSHO. Your worstcase is the old behavior -- the maximum number of times the fault to clog can happen for a page is once. It's impossible to even get one more clog access than you did with stock postgres. > For me, it makes sense for us to set hint bits on already-dirty pages > when they are written out by the bgwriter. At the moment we do very > little to amortise the I/O that must already happen. I could agree with this. However if the bgwriter is forcing out pages before the hint bits can be set you have a problem. Also adding more work to the bgwriter may cause other secondary performance issues. merlin
pgsql-hackers by date: