Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync
Date
Msg-id 0005ca83-c5e5-4ea7-94a6-17e973fa47d8@iki.fi
In response to AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync  (Andres Freund <andres@anarazel.de>)
Responses Re: AdvanceXLInsertBuffers() vs wal_sync_method=open_datasync
List pgsql-hackers
On 10/11/2023 05:54, Andres Freund wrote:
> In this case I had used wal_sync_method=open_datasync - it's often faster and
> if we want to scale WAL writes more we'll have to use it more widely (you
> can't have multiple fdatasyncs in progress and reason about which one affects
> what, but you can have multiple DSYNC writes in progress at the same time).

Not sure I understand that. If you issue an fdatasync, it will sync all 
writes that were complete before the fdatasync started. Right? If you 
have multiple fdatasyncs in progress, that's true for each fdatasync. Or 
is there a bottleneck in the kernel with multiple in-progress fdatasyncs 
or something?
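
For reference, the two styles being compared look roughly like this at the syscall level. This is only a sketch: the helper names are made up for illustration and this is not the code in xlog.c. It shows a buffered write followed by fdatasync() next to a write on a descriptor opened with O_DSYNC, where each write is durable by the time it returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Style A (hypothetical helper): buffered write, then fdatasync().
 * The fdatasync() call applies to the whole file descriptor, not to
 * one particular write.
 */
static int
write_then_fdatasync(int fd, const void *buf, size_t len, off_t offset)
{
    if (pwrite(fd, buf, len, offset) != (ssize_t) len)
        return -1;
    return fdatasync(fd);
}

/*
 * Style B (hypothetical helpers): open the file with O_DSYNC so that
 * each pwrite() returns only after the written data is durable.
 */
static int
open_for_dsync_writes(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600);
}

static int
write_dsync(int fd, const void *buf, size_t len, off_t offset)
{
    return pwrite(fd, buf, len, offset) == (ssize_t) len ? 0 : -1;
}
```

In style B the durability point is tied to a specific byte range, which is presumably what makes concurrent DSYNC writes easier to reason about than concurrent fdatasync() calls.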

> After a bit of confused staring and debugging I figured out that the problem
> is that the RequestXLogSwitch() within the code for starting a basebackup was
> triggering writing back the WAL in individual 8kB writes via
> GetXLogBuffer()->AdvanceXLInsertBuffer(). With open_datasync each of these
> writes is durable - on this drive each take about 1ms.

I see. So the assumption in AdvanceXLInsertBuffer() is that XLogWrite() 
is relatively fast. But with open_datasync, it's not.
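
A back-of-the-envelope sketch of why that hurts. The 8kB write size and the roughly 1ms per durable write come from the report above; the 16MB segment size is the PostgreSQL default and is an assumption about this setup, as is the simplification that a larger write costs about the same fixed latency as a small one. The function names are hypothetical.

```c
#include <stddef.h>

/*
 * Number of synchronous writes needed to push out `bytes` of WAL when
 * each write covers `pages_per_write` blocks of `block_size` bytes.
 */
static size_t
sync_writes_needed(size_t bytes, size_t block_size, size_t pages_per_write)
{
    size_t pages = (bytes + block_size - 1) / block_size;

    return (pages + pages_per_write - 1) / pages_per_write;
}

/* Total latency if every synchronous write costs a fixed ms_per_write. */
static double
estimated_ms(size_t writes, double ms_per_write)
{
    return (double) writes * ms_per_write;
}
```

Under those assumptions, padding out a whole 16MB segment one 8kB page at a time is up to 2048 durable writes, or about two seconds, while writing 128 pages per call would need only 16.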

> To fix this, I suspect we need to make
> GetXLogBuffer()->AdvanceXLInsertBuffer() flush more aggressively. In this
> specific case, we even know for sure that we are going to fill a lot more
> buffers, so no heuristic would be needed. In other cases however we need some
> heuristic to know how much to write out.

+1. Maybe use the same logic as in XLogFlush().

I wonder if the 'flexible' argument to XLogWrite() is too inflexible. It 
would be nice to pass a hard minimum XLogRecPtr that it must write up 
to, but still allow it to write more than that if it's convenient.
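
A minimal sketch of what such a "hard minimum, opportunistic maximum" decision could look like. Everything here is hypothetical: the function name, its parameters, and the choice of a block boundary as the "convenient" stopping point are illustrative assumptions, not the existing XLogWrite() interface. XLogRecPtr is stood in for by a plain 64-bit integer.

```c
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* stand-in for the real typedef */

/*
 * Hypothetical helper: pick how far to write.
 *
 * must_write_upto  - hard minimum the caller needs written
 * available_upto   - how far WAL is currently ready to be written
 * block_size       - WAL block size in bytes, e.g. 8192
 *
 * Writes at least up to must_write_upto, and extends to the last full
 * block boundary at or before available_upto when that is further.
 */
static XLogRecPtr
choose_write_target(XLogRecPtr must_write_upto,
                    XLogRecPtr available_upto,
                    uint32_t block_size)
{
    XLogRecPtr convenient = available_upto - (available_upto % block_size);

    return convenient > must_write_upto ? convenient : must_write_upto;
}
```

With a required position partway into the second block and five full blocks available, this returns the end of the fifth block; if nothing beyond the required position fills a whole block, it returns the required position unchanged.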

-- 
Heikki Linnakangas
Neon (https://neon.tech)



