Re: Syncrep and improving latency due to WAL throttling - Mailing list pgsql-hackers
From | Jakub Wartak |
---|---|
Subject | Re: Syncrep and improving latency due to WAL throttling |
Date | |
Msg-id | CAKZiRmyR_OBZfvaG03piRxxg7XDC+dmGx50P6Pmn-tMBLLdhVQ@mail.gmail.com |
In response to | Re: Syncrep and improving latency due to WAL throttling (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses | Re: Syncrep and improving latency due to WAL throttling |
List | pgsql-hackers |
On Thu, Feb 2, 2023 at 11:03 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

> > I agree that some other concurrent backend's
> > COMMIT could fsync it, but I was wondering if that's sensible
> > optimization to perform (so that issue_fsync() would be called for
> > only commit/rollback records). I can imagine a scenario with 10 such
> > concurrent backends running - all of them with this $thread-GUC set -
> > but that would cause 20k unnecessary fsyncs (?) -- (assuming single
> > HDD with IOlat=20ms and standby capable of sync-ack < 0.1ms , that
> > would be wasted close to 400s just due to local fsyncs?). I don't have
> > a strong opinion or in-depth on this, but that smells like IO waste.
> >
>
> Not sure what optimization you mean,

Let me clarify. Let's say something like the below (on top of v3), just
to save IOPS:

--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2340,6 +2340,7 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
         if (sync_method != SYNC_METHOD_OPEN &&
             sync_method != SYNC_METHOD_OPEN_DSYNC)
         {
+            bool openedLogFile = false;
             if (openLogFile >= 0 &&
                 !XLByteInPrevSeg(LogwrtResult.Write, openLogSegNo,
                                  wal_segment_size))
@@ -2351,9 +2352,15 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
                 openLogTLI = tli;
                 openLogFile = XLogFileOpen(openLogSegNo, tli);
                 ReserveExternalFD();
+                openedLogFile = true;
             }

-            issue_xlog_fsync(openLogFile, openLogSegNo, tli);
+            /* can we bypass fsyncing() XLOG from the backend if
+             * we have been called without commit request?
+             * usually the feature will be off here (XLogDelayPending=false)
+             */
+            if(openedLogFile == true || XLogDelayPending == false)
+                issue_xlog_fsync(openLogFile, openLogSegNo, tli);
         }

+ maybe some additional logic to ensure that this micro-optimization for
saving IOPS would not be enabled if the backend is calling
XLogFlush/Write() for an actual COMMIT record.

> But I think the backends still have to sleep at some point, so that they
> don't queue too much unflushed WAL - that's kinda the whole point, no?

Yes, but it can be flushed to standby, flushed locally but not fsynced
locally (?) - provided that it was not COMMIT - I'm just wondering
whether it makes sense (Question 1)

> The issue is more about triggering the throttling too early, before we
> hit the bandwidth limit. Which happens simply because we don't have a
> very good way to decide whether the latency is growing, so the patch
> just throttles everything.

The maximum TCP bandwidth seems to fluctuate in the real world, I
suppose, so it couldn't be a hard limit. On the other hand, I can
imagine operators setting
"throttle-those-backends-if-global-WALlatencyORrate>XXX" (an
administrative decision). That would be cool to have, but yes, it would
require WAL latency and rate measurement first (on its own that would
make a very nice addition to pg_stat_replication). But one thing to
note is that there could be many potential latencies (& WAL throughput
rates) to consider (e.g. a quorum of 3 sync standbys having different
latencies) - which one to choose?

(Question 2) I think we have simply reached a decision point on whether
the WIP/PoC is good enough as it is (like Andres wanted and you +1 to
this), or it should work as you propose, or maybe keep it as an idea for
the future?

-J.