Re: Proposed LogWriter Scheme, WAS: Potential Large - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Proposed LogWriter Scheme, WAS: Potential Large
Date
Msg-id 1033825452.9687.16.camel@taru.tm.ee
In response to Re: Proposed LogWriter Scheme, WAS: Potential Large Performance  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Proposed LogWriter Scheme, WAS: Potential Large Performance
List pgsql-hackers
Bruce Momjian wrote on Sat, 05.10.2002 at 13:49:
> Curtis Faith wrote:
> > Back-end servers would not issue fsync calls. They would simply block
> > waiting until the LogWriter had written their record to the disk, i.e.
> > until the sync'd block # was greater than the block that contained the
> > XLOG_XACT_COMMIT record. The LogWriter could wake up committed back-
> > ends after its log write returns.
> > 
> > The log file would be opened O_DSYNC, O_APPEND every time. The LogWriter
> > would issue writes of the optimal size when enough data was present or
> > of smaller chunks if enough time had elapsed since the last write.
> 
> So every backend is going to wait around until its fsync gets done by
> the backend process?  How is that a win?  This is just another version
> of our GUC parameters:
>     
>     #commit_delay = 0               # range 0-100000, in microseconds
>     #commit_siblings = 5            # range 1-1000
> 
> which attempt to delay fsync if other backends are nearing commit.  
> Pushing things out to another process isn't a win;  figuring out if
> someone else is coming for commit is. 

Exactly. If I understand correctly what Curtis is proposing, you don't
have to figure it out under his scheme - you just issue a WALWait
command and the WAL writing process notifies you when your transaction's
WAL is in safe storage.

If the other committer was able to get his WALWait in before the actual
write took place, it will be notified too; if not, it will be notified
about 1/166th sec. later (for a 10K rpm disk), when its write is done on
the next rev of the disk platters.
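
For anyone checking the 1/166th figure, here is the arithmetic as a small
Python sketch. The function name is made up for illustration, and it only
models one full platter revolution (no seek time, no controller cache).

```python
def rotation_time_seconds(rpm: float) -> float:
    """Time for one full platter revolution, in seconds."""
    return 60.0 / rpm


# A 10K rpm disk turns 10000 / 60 = ~166.7 times per second,
# so one revolution takes 60 / 10000 = 0.006 s.
rev = rotation_time_seconds(10_000)
revs_per_second = 1.0 / rev
print(f"one revolution: {rev * 1000:.1f} ms ({revs_per_second:.1f} rev/s)")
```

So a committer that just misses a write waits roughly 6 ms for the next one.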

The writer process should just issue a continuous stream of
aio_write()'s while there are any waiters and keep track of which
waiters are safe to continue - thus no guessing of who's gonna commit.
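
A rough sketch of that bookkeeping, as a toy model in Python rather than
anything resembling the real backend code. Everything here is an
assumption made for illustration: the LogWriter class, the wal_wait()
method standing in for WALWait, threads standing in for processes, and an
in-memory buffer standing in for the disk write (no aio_write() is
actually issued).

```python
import threading


class LogWriter:
    """Toy model: one writer thread flushes the log; backends only wait."""

    def __init__(self):
        self._cond = threading.Condition()
        self._buffer = bytearray()   # stands in for the WAL buffers
        self._flushed_upto = 0       # bytes known to be "on disk"
        self._stop = False
        self.flush_count = 0         # how many writes the writer issued
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def append(self, record: bytes) -> int:
        """Add a record and return the log position just past it."""
        with self._cond:
            self._buffer += record
            self._cond.notify_all()  # wake the writer: there is work
            return len(self._buffer)

    def wal_wait(self, position: int) -> None:
        """Block until everything up to `position` has been flushed."""
        with self._cond:
            self._cond.wait_for(lambda: self._flushed_upto >= position)

    def _run(self):
        while True:
            with self._cond:
                self._cond.wait_for(
                    lambda: self._stop
                    or len(self._buffer) > self._flushed_upto)
                if self._stop and len(self._buffer) == self._flushed_upto:
                    return
                target = len(self._buffer)  # everything appended so far
            # A real writer would do the disk write here, outside the lock.
            with self._cond:
                self._flushed_upto = target
                self.flush_count += 1
                self._cond.notify_all()     # release every covered waiter

    def close(self):
        with self._cond:
            self._stop = True
            self._cond.notify_all()
        self._thread.join()


def backend(log, name, done):
    """A committing backend: append the commit record, then just wait."""
    pos = log.append(f"COMMIT {name}\n".encode())
    log.wal_wait(pos)
    done.append(name)


log = LogWriter()
done = []
threads = [threading.Thread(target=backend, args=(log, i, done))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
log.close()
print(f"{len(done)} commits made durable by {log.flush_count} flushes")
```

The point of the sketch is that no backend guesses anything: each one
records its position and sleeps, and one write releases every waiter whose
record it covered. How many flushes it takes depends on timing, so the
count varies from run to run.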

If supported by the platform this should use zero-copy writes - it
should be safe because WAL is append-only.

-----------
Hannu


