Re: XLogFlush - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: XLogFlush
Msg-id f67928030908310848w32a4d4bcr835a7f54cb2b1f91@mail.gmail.com
In response to XLogFlush  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Fri, Aug 21, 2009 at 1:18 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
Maybe this is one of those things that is obvious when someone points
it out to you, but right now I am not seeing it.  If you look at the
last eight lines of this snippet from XLogFlush, you see that if we
obtain WriteRqstPtr under the WALInsertLock, then we both write and
flush up to the highest write request.  But if we obtain it under the
info_lck, then we write up to the highest write request but flush only
up to our own record's flush request.  Why the disparate treatment?
The effect of this seems to be that when WALInsertLock is busy, group
commits are suppressed.

I realized I was misinterpreting this.  XLogWrite doesn't flush only up to WriteRqst.Flush, because fsync doesn't work that way: it syncs the whole file, not a byte range.  If XLogWrite flushes at all (which I think it always will when invoked from XLogFlush, as otherwise XLogFlush would not have called it), it flushes everything up to WriteRqst.Write, even if WriteRqst.Flush is behind.  So as long as record <= WriteRqst.Flush <= WriteRqst.Write, the exact value of WriteRqst.Flush doesn't matter.

The real problem with group commit on a busy WALInsertLock is one of timing.  If xlogctl->LogwrtRqst.Write does get advanced by someone else, that will almost surely happen while we are waiting on the WALWriteLock, which is too late for us to have seen it when we checked earlier under the protection of info_lck.  We should probably extend the else branch of the LWLockConditionalAcquire so that, if the acquire fails, we take the info_lck and check again whether xlogctl->LogwrtRqst.Write has advanced.

But since Simon is doing big changes as part of sync rep, I'll hold off on doing much experimentation on this until then.
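
To make the flush argument concrete, here is a small toy model in C.  It is only a sketch under stated assumptions: ToyLsn, ToyRqst, ToyResult and toy_xlog_write are hypothetical names, log positions are plain integers, and no real I/O or locking takes place.  It models nothing beyond the claim that an fsync makes everything already written durable, so the requested flush point only decides whether a flush happens, not how far it reaches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: positions are plain byte offsets. */
typedef uint64_t ToyLsn;

typedef struct
{
    ToyLsn Write;   /* write out at least up to here */
    ToyLsn Flush;   /* flush at least up to here */
} ToyRqst;

typedef struct
{
    ToyLsn Write;   /* how far the log has been written */
    ToyLsn Flush;   /* how far the log is known durable */
} ToyResult;

/* Models one write-then-fsync pass over the log. */
static ToyResult
toy_xlog_write(ToyResult done, ToyRqst rqst)
{
    /* Precondition taken from the discussion: Flush never exceeds Write. */
    assert(rqst.Flush <= rqst.Write);

    /* Write everything up to the write request. */
    if (done.Write < rqst.Write)
        done.Write = rqst.Write;

    /*
     * If any flush is needed, the fsync covers the whole file, so the
     * durable point jumps to the written point, not to rqst.Flush.
     */
    if (done.Flush < rqst.Flush)
        done.Flush = done.Write;

    return done;
}
```

In this model, a request of {Write = 500, Flush = 100} and a request of {Write = 500, Flush = 500} leave the log in the same state, which is why the two branches of the snippet below end up flushing equally far.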

 
               LWLockRelease(WALInsertLock);
               WriteRqst.Write = WriteRqstPtr;
               WriteRqst.Flush = WriteRqstPtr;
       }
       else
       {
               WriteRqst.Write = WriteRqstPtr;
               WriteRqst.Flush = record;
       }
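
The re-check suggested above for the else branch could be sketched roughly as follows.  This is an illustration, not a patch: SketchLsn, SketchRqst and sketch_pick_request are hypothetical names, the locks are reduced to a boolean, and shared_rqst_now stands in for the value of xlogctl->LogwrtRqst.Write that a second read under info_lck would return.  The sketch also assumes that holding WALInsertLock lets us see the newest request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: positions are plain byte offsets. */
typedef uint64_t SketchLsn;

typedef struct
{
    SketchLsn Write;
    SketchLsn Flush;
} SketchRqst;

/*
 * got_insert_lock: did the conditional acquire of WALInsertLock succeed?
 * recheck:         apply the proposed second look under info_lck?
 * record:          end of our own record
 * stale_rqst:      WriteRqstPtr as read before waiting on WALWriteLock
 * shared_rqst_now: shared write request as it stands after the wait
 */
static SketchRqst
sketch_pick_request(bool got_insert_lock, bool recheck,
                    SketchLsn record, SketchLsn stale_rqst,
                    SketchLsn shared_rqst_now)
{
    SketchRqst rqst;

    /* WriteRqstPtr starts at record and is only ever moved forward. */
    assert(record <= stale_rqst);

    if (got_insert_lock)
    {
        /* Assumption: under the insert lock we see the newest request. */
        SketchLsn newest = stale_rqst;

        if (newest < shared_rqst_now)
            newest = shared_rqst_now;
        rqst.Write = newest;
        rqst.Flush = newest;
    }
    else
    {
        SketchLsn target = stale_rqst;

        /* Proposed: the real code would take info_lck around this read. */
        if (recheck && target < shared_rqst_now)
            target = shared_rqst_now;
        rqst.Write = target;
        rqst.Flush = record;
    }
    return rqst;
}
```

With recheck set, a backend that failed to get WALInsertLock still picks up a request that advanced while it waited on WALWriteLock, so its write (and therefore its fsync) would cover the other backends' records as well.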

Cheers,

Jeff
 
