Re: Two fsync related performance issues? - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Two fsync related performance issues?
Msg-id 20200512060430.GI88791@paquier.xyz
In response to Re: Two fsync related performance issues?  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Two fsync related performance issues?
List pgsql-hackers
On Tue, May 12, 2020 at 12:55:37PM +0900, Fujii Masao wrote:
> On 2020/05/12 9:42, Paul Guo wrote:
>> 1. StartupXLOG() does fsync on the whole data directory early in
>> the crash recovery. I'm wondering if we could skip some
>> directories (at least the pg_log/, table directories) since wal,
>> etc could ensure consistency.
>
> I agree that we can skip log directory but I'm not sure if skipping
> table directory is really safe. Also ISTM that we can skip the directories
> whose contents are removed or zeroed during recovery,
> for example, pg_snapshots, pg_subtrans, etc.

Basically the list in excludeDirContents[] in basebackup.c.
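
As a rough illustration of that idea (not the actual PostgreSQL code), a
skip check could match each path against a NULL-terminated list of directory
names. The names skip_fsync_dirs and should_skip_fsync are hypothetical, and
the list holds only the two directories named above; a real patch would
presumably reuse or mirror excludeDirContents[].

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical list of directories whose contents are removed or zeroed
 * during recovery. Only the two names mentioned in this thread are listed;
 * this is an assumption, not a copy of excludeDirContents[].
 */
static const char *const skip_fsync_dirs[] = {
    "pg_snapshots",
    "pg_subtrans",
    NULL
};

/*
 * Return true if relpath (relative to the data directory) is one of the
 * listed directories or lies underneath one of them.
 */
static bool
should_skip_fsync(const char *relpath)
{
    for (int i = 0; skip_fsync_dirs[i] != NULL; i++)
    {
        size_t      len = strlen(skip_fsync_dirs[i]);

        /* Match whole path components only, so "pg_subtrans_old" is kept. */
        if (strncmp(relpath, skip_fsync_dirs[i], len) == 0 &&
            (relpath[len] == '\0' || relpath[len] == '/'))
            return true;
    }
    return false;
}
```

A directory walker would call should_skip_fsync() before descending into an
entry, and would probably still fsync the directory entry itself.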

>> RecreateTwoPhaseFile() writes a state file for a prepared
>> transaction and does fsync. It might be good to do fsync for all
>> files once after writing them, given the kernel is able to do
>> asynchronous flush when writing those file contents. If
>> the TwoPhaseState->numPrepXacts is large we could do batching to
>> avoid the fd resource limit. I did not test them yet but this
>> should be able to speed up checkpoint/restartpoint a bit.
>
> It seems worth making the patch and measuring the performance improvement.

You would need to do some micro-benchmarking here, so you could
plug in some pg_rusage_init() & co calls within this code path, with
many 2PC files present at the same time.  However, I believe that this
is not really worth the potential code complications.
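
For reference, the batching proposed in the quoted text could look roughly
like the standalone sketch below. It uses plain POSIX calls rather than
PostgreSQL's file APIs. The function name write_files_batched, the
MAX_BATCH limit and the use of one shared payload are all assumptions made
for illustration; this is not what RecreateTwoPhaseFile() does today.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_BATCH 64            /* hypothetical cap on descriptors held open */

/*
 * Write the same payload to each path, deferring fsync() until every file
 * in the current batch has been written. Returns 0 on success, -1 on the
 * first failure. A short write is treated as a failure in this sketch.
 */
static int
write_files_batched(const char *const *paths, size_t npaths,
                    const void *buf, size_t len, size_t batch_size)
{
    int         fds[MAX_BATCH];
    int         rc = 0;

    if (batch_size == 0 || batch_size > MAX_BATCH)
        batch_size = MAX_BATCH;

    for (size_t start = 0; start < npaths && rc == 0; start += batch_size)
    {
        size_t      nopen = 0;
        size_t      end = start + batch_size;

        if (end > npaths)
            end = npaths;

        /* Pass 1: write each file in the batch without flushing it. */
        for (size_t i = start; i < end; i++)
        {
            int         fd = open(paths[i], O_WRONLY | O_CREAT | O_TRUNC, 0600);

            if (fd < 0)
            {
                rc = -1;
                break;
            }
            fds[nopen++] = fd;
            if (write(fd, buf, len) != (ssize_t) len)
            {
                rc = -1;
                break;
            }
        }

        /* Pass 2: flush (unless already failed) and close what was opened. */
        for (size_t i = 0; i < nopen; i++)
        {
            if (rc == 0 && fsync(fds[i]) != 0)
                rc = -1;
            if (close(fds[i]) != 0)
                rc = -1;
        }
    }
    return rc;
}
```

Timing this against a variant that calls fsync() right after each write(),
with a large number of files, would show whether the extra bookkeeping buys
anything.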
--
Michael
