Re: fdatasync performance problem with large number of DB files - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: fdatasync performance problem with large number of DB files
Date
Msg-id 52453bfc-40fb-7771-05ee-cbd04b6b9b6e@oss.nttdata.com
In response to Re: fdatasync performance problem with large number of DB files  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: fdatasync performance problem with large number of DB files  (Paul Guo <guopa@vmware.com>)
List pgsql-hackers

On 2021/03/17 12:45, Thomas Munro wrote:
> On Tue, Mar 16, 2021 at 9:29 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>> On 2021/03/16 8:15, Thomas Munro wrote:
>>> I don't want to add a hypothetical sync_after_crash=none, because it
>>> seems like generally a bad idea.  We already have a
>>> running-with-scissors mode you could use for that: fsync=off.
>>
>> I heard that some backup tools sync the database directory when restoring it.
>> I guess that those who use such tools might want the option to disable such
>> startup sync (i.e., sync_after_crash=none) because it's not necessary.
> 
> Hopefully syncfs() will return quickly in that case, without doing any work?

Yes, on Linux.

> 
>> They can skip that sync by setting fsync=off. But if they want to skip only that
>> startup sync and have subsequent recovery (or the standby server) run with
>> fsync=on, they would need to shut down the server after that startup sync
>> finishes, enable fsync, and restart the server. In this case, since the server
>> is restarted with state=DB_SHUTDOWNED_IN_RECOVERY, the startup sync
>> would not be performed. This procedure is tricky. So IMO supporting
>> sync_after_crash=none would be simple and helpful for this case.
> 
> I still do not like this footgun :-)  However, perhaps I am being
> overly dogmatic.  Consider the change in d8179b00, which decided that
> I/O errors in this phase should be reported at LOG level rather than
> ERROR.  In contrast, my "sync_after_crash=wal" mode (which I need to
> rebase over this) will PANIC in this case, because any syncing will be
> handled through the usual checkpoint codepaths.
> 
> Do you think it would be OK to commit this feature with just "fsync"
> and "syncfs", and then to continue to consider adding "none" as a
> possible separate commit?

+1. The "syncfs" feature is useful whether or not we also support a "none" mode.
It's a good idea to commit "syncfs" first.
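
For illustration only, here is a minimal sketch of the two system calls behind
the "fsync" and "syncfs" modes discussed above. It is not the patch's code:
the function names sync_one_path() and sync_containing_filesystem() are
hypothetical, and the sketch assumes Linux with glibc, where syncfs() is
available once _GNU_SOURCE is defined.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* syncfs() is a Linux-specific extension */
#endif
#include <fcntl.h>
#include <unistd.h>

/*
 * "fsync" style: flush one file or directory. A caller would have to do
 * this for every file under the data directory, which is what gets slow
 * when there are very many files. Returns 0 on success, -1 on error.
 * (Hypothetical helper, not a PostgreSQL function.)
 */
int
sync_one_path(const char *path)
{
    int         fd = open(path, O_RDONLY);
    int         rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/*
 * "syncfs" style: a single call flushes the whole filesystem that
 * contains the given path. Returns 0 on success, -1 on error.
 * (Hypothetical helper, not a PostgreSQL function.)
 */
int
sync_containing_filesystem(const char *path)
{
    int         fd = open(path, O_RDONLY);
    int         rc;

    if (fd < 0)
        return -1;
    rc = syncfs(fd);
    close(fd);
    return rc;
}
```

With syncfs(), one call per filesystem replaces one open/fsync/close cycle per
file, at the cost of also flushing unrelated data that happens to live on the
same filesystem.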

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


