Re: Proposed LogWriter Scheme, WAS: Potential Large Performance - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Proposed LogWriter Scheme, WAS: Potential Large Performance
Date
Msg-id 200210051201.g95C12C19377@candle.pha.pa.us
List pgsql-hackers
pgman wrote:
> Curtis Faith wrote:
> > Back-end servers would not issue fsync calls. They would simply block
> > waiting until the LogWriter had written their record to the disk, i.e.
> > until the sync'd block # was greater than the block that contained the
> > XLOG_XACT_COMMIT record. The LogWriter could wake up committed back-
> > ends after its log write returns.
> > 
> > The log file would be opened O_DSYNC, O_APPEND every time. The LogWriter
> > would issue writes of the optimal size when enough data was present or
> > of smaller chunks if enough time had elapsed since the last write.
> 
> So every backend is going to wait around until its fsync gets done by
> the backend process?  How is that a win?  This is just another version
> of our GUC parameters:
>     
>     #commit_delay = 0               # range 0-100000, in microseconds
>     #commit_siblings = 5            # range 1-1000
> 
> which attempt to delay fsync if other backends are nearing commit.  
> Pushing things out to another process isn't a win;  figuring out if
> someone else is coming for commit is.  Remember, write() is fast, fsync
> is slow.
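
To make the quoted point about commit_delay and commit_siblings concrete, here is a rough sketch of the decision a committing backend makes just before its fsync. It is a simplification in Python, not the server code; the function name delay_before_fsync and the other_active_xacts argument are made up for illustration.

```python
import time


def delay_before_fsync(commit_delay_us, commit_siblings, other_active_xacts,
                       sleep=time.sleep):
    """Sleep briefly before fsync if enough other transactions are active.

    Hypothetical helper: commit_delay_us and commit_siblings mirror the GUC
    parameters quoted above; other_active_xacts is an assumed count of other
    backends currently inside a transaction.
    Returns True if the caller was delayed, False otherwise.
    """
    if commit_delay_us > 0 and other_active_xacts >= commit_siblings:
        # Give the siblings a chance to write() their commit records so that
        # a single fsync can cover several commits at once.
        sleep(commit_delay_us / 1_000_000)
        return True
    return False
```

The delay only pays off when a sibling really does reach commit during the sleep, which is the guess the quoted text says is the hard part.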

Let me add to what I just said:

While the above idea doesn't win for normal operation, because each
backend waits for the fsync, and we have no good way of determining if
other backends are nearing commit, a background WAL fsync process would
be nice if we wanted an option between fsync on (wait for fsync before
reporting commit), and fsync off (no crash recovery).

We could have a mode where we did an fsync every X milliseconds, so we
issue a COMMIT to the client, but wait a few milliseconds before
fsync'ing.  Many other databases have such a mode, but we don't, and I
always felt it would be valuable.  It may allow us to remove the fsync
option in favor of one that has _some_ crash recovery.
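
A minimal sketch of what such a mode might look like, assuming a single append-only log file and a background thread that issues the fsync. This is Python for illustration only, not a proposal for the actual implementation; the class DeferredSyncLog and the parameter sync_interval_ms are hypothetical names.

```python
import os
import tempfile
import threading


class DeferredSyncLog:
    """Append-only log: commit() only write()s, a background thread fsync()s."""

    def __init__(self, path, sync_interval_ms=10):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self.interval = sync_interval_ms / 1000.0
        self.lock = threading.Lock()
        self.written = 0      # bytes handed to write()
        self.synced = 0       # bytes known to be covered by a completed fsync
        self.sync_calls = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def commit(self, record: bytes) -> None:
        """Write the commit record and return without waiting for fsync."""
        with self.lock:
            view = memoryview(record)
            while len(view) > 0:
                n = os.write(self.fd, view)   # write() is the fast part
                view = view[n:]
            self.written += len(record)

    def _sync_once(self) -> None:
        with self.lock:
            target = self.written
        if target == self.synced:
            return                            # nothing new since last fsync
        os.fsync(self.fd)                     # fsync is the slow part
        self.synced = target                  # lower bound on durable bytes
        self.sync_calls += 1

    def _run(self) -> None:
        # Wake up every interval until close() asks us to stop.
        while not self._stop.wait(self.interval):
            self._sync_once()

    def close(self) -> None:
        self._stop.set()
        self._thread.join()
        self._sync_once()                     # final fsync for trailing commits
        os.close(self.fd)
```

In this sketch, a crash loses at most the commits written since the last completed fsync, which is the "some crash recovery" middle ground between fsync on and fsync off.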
-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

