AW: AW: AW: WAL does not recover gracefully from out-of-disk-space - Mailing list pgsql-hackers

From Zeugswetter Andreas SB
Subject AW: AW: AW: WAL does not recover gracefully from out-of-disk-space
Date
Msg-id 11C1E6749A55D411A9670001FA687963368239@sdexcsrv1.f000.d0188.sd.spardat.at
Responses Re: AW: AW: AW: WAL does not recover gracefully from out-of-disk-space  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> > A short test shows that opening the file with O_SYNC, and thus avoiding fsync(),
> > would cut the effective time needed to sync-write the xlog by more than half.
> > Of course we would need to buffer >= 1 xlog page before write (or commit)
> > to gain the full advantage.
> 
> > prewrite 0 + write and fsync:       60.4 sec
> > sparse file + write with O_SYNC:    37.5 sec
> > no prewrite + write with O_SYNC:    36.8 sec
> > prewrite 0 + write with O_SYNC:     24.0 sec
> 
> This seems odd.  As near as I can tell, O_SYNC is simply a command to do
> fsync implicitly during each write call.  It cannot save any I/O unless
> I'm missing something significant.  Where is the performance difference
> coming from?

Yes, it is odd, but it is very reproducible here.
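
For anyone who wants to try to reproduce the comparison, here is a minimal sketch of such a test. It is not the program that produced the numbers quoted above. The 8k page size, the function names and the file name are all assumptions made for illustration, and the absolute timings will depend heavily on the OS and filesystem.

```c
/* Sketch only: times write()+fsync() per page against write() on an
 * O_SYNC descriptor. BLCKSZ and all names here are assumptions. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define BLCKSZ 8192             /* assumed xlog page size */

/* "prewrite 0": fill the file with zeros so that later writes only
 * overwrite blocks that are already allocated. */
static int prewrite_zeros(const char *path, int nblocks)
{
    char buf[BLCKSZ];
    int  fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(buf, 0, sizeof buf);
    for (int i = 0; i < nblocks; i++)
        if (write(fd, buf, sizeof buf) != (ssize_t) sizeof buf) {
            close(fd);
            return -1;
        }
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Overwrite nblocks pages of an existing file, syncing every page.
 * use_osync != 0: rely on O_SYNC; otherwise call fsync() after each write.
 * Returns elapsed seconds, or -1.0 on error. */
static double timed_sync_writes(const char *path, int nblocks, int use_osync)
{
    char            buf[BLCKSZ];
    struct timespec t0, t1;
    int             fd;

    assert(nblocks > 0);
    fd = open(path, O_WRONLY | (use_osync ? O_SYNC : 0));
    if (fd < 0)
        return -1.0;
    memset(buf, 'x', sizeof buf);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nblocks; i++) {
        if (write(fd, buf, sizeof buf) != (ssize_t) sizeof buf ||
            (!use_osync && fsync(fd) != 0)) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    return (double) (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling prewrite_zeros() once and then timed_sync_writes() with use_osync set to 0 and to 1 corresponds to the "prewrite 0" rows; skipping the prewrite step and creating an empty file instead would approximate the "no prewrite" row.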

> The reason I'm inclined to question this is that what we want is not an
> fsync per write but an fsync per transaction, and we can't easily buffer
> all of a transaction's XLOG writes...

Yes, that is something to consider, but it would probably be sufficient to buffer
1-3 optimal IO blocks (32-256k here).
I assumed that with a few busy clients the amount written per fsync would come
close to one xlog page, but that is probably too little.
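
To make the buffering idea concrete, here is a rough sketch, not a proposal for the actual xlog code. It collects records in a fixed buffer and hands them to an O_SYNC descriptor only when the buffer fills or at commit, so each flush is issued as one larger write. XLogBuf, the xlogbuf_* functions and the 32k size (the low end of the range above) are hypothetical names and values.

```c
/* Sketch only: buffer xlog records and sync-write them in larger chunks.
 * All names and the buffer size are assumptions for illustration. */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define XLOG_BUF_SIZE (32 * 1024)   /* assumed "optimal IO block" */

typedef struct {
    int    fd;                      /* descriptor opened with O_SYNC */
    size_t used;                    /* bytes currently buffered */
    long   flushes;                 /* number of flushes performed */
    char   data[XLOG_BUF_SIZE];
} XLogBuf;

static int xlogbuf_open(XLogBuf *b, const char *path)
{
    b->used = 0;
    b->flushes = 0;
    b->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600);
    return b->fd < 0 ? -1 : 0;
}

/* Write out everything buffered; with O_SYNC the data is on disk
 * when write() returns, so no separate fsync() is needed. */
static int xlogbuf_flush(XLogBuf *b)
{
    size_t off = 0;

    if (b->used == 0)
        return 0;
    while (off < b->used) {
        ssize_t n = write(b->fd, b->data + off, b->used - off);

        if (n <= 0)
            return -1;
        off += (size_t) n;
    }
    b->used = 0;
    b->flushes++;
    return 0;
}

/* Append one record; flush only when the buffer is completely full. */
static int xlogbuf_append(XLogBuf *b, const void *rec, size_t len)
{
    const char *p = rec;

    while (len > 0) {
        size_t room, n;

        assert(b->used < XLOG_BUF_SIZE);
        room = XLOG_BUF_SIZE - b->used;
        n = len < room ? len : room;
        memcpy(b->data + b->used, p, n);
        b->used += n;
        p += n;
        len -= n;
        if (b->used == XLOG_BUF_SIZE && xlogbuf_flush(b) != 0)
            return -1;
    }
    return 0;
}

/* At commit the buffered tail must reach disk before we report success. */
static int xlogbuf_commit(XLogBuf *b)
{
    return xlogbuf_flush(b);
}
```

With this shape, ten 100-byte records followed by a commit cost one synchronous flush instead of ten. The sketch ignores concurrent backends and page headers entirely.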

Andreas

