Re: fsync data directory after DB crash - Mailing list pgsql-general

From Thomas Munro
Subject Re: fsync data directory after DB crash
Date
Msg-id CA+hUKGL9prvoxxE2gBk=Cm+B5=yynB5Vbj+7hnHFFY7S_rO3eA@mail.gmail.com
In response to Re: fsync data directory after DB crash ("Pandora" <yeyukui@qq.com>)
List pgsql-general
On Wed, Jul 19, 2023 at 2:09 PM Pandora <yeyukui@qq.com> wrote:
> Yes, I saw the usage of syncfs in PG14, but it is recommended to use it on Linux 5.8 or higher. If my OS version is
> lower than 5.8, can I still enable it?

Nothing stops you from enabling it.  syncfs() itself is fairly ancient
and should work.  It just doesn't promise to report errors before
Linux 5.8, which is why we don't recommend it on older kernels, so you
have to weigh the risks yourself.  One way to think about the risks:
all we do with those errors is log them, and you could probably also
check the kernel logs for errors.
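
As a rough illustration of the version cut-off mentioned above, here is a
minimal Python sketch that compares a kernel release string against 5.8.
The function names are hypothetical, and the check looks only at the
version number, so it cannot account for distribution kernels that carry
backported changes.

```python
import platform
import re
from typing import Tuple

# Threshold taken from the discussion above: syncfs() is only expected to
# report writeback errors on Linux 5.8 or later.
SYNCFS_ERROR_REPORTING_MIN = (5, 8)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '5.4.0-150-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def syncfs_reports_errors(release: str) -> bool:
    """True if the version number alone suggests syncfs() reports errors."""
    return parse_kernel_release(release) >= SYNCFS_ERROR_REPORTING_MIN


if __name__ == "__main__":
    if platform.system() == "Linux":
        release = platform.release()
        if syncfs_reports_errors(release):
            print(f"{release}: syncfs() is expected to report errors")
        else:
            print(f"{release}: syncfs() may not report errors; "
                  "consider checking the kernel logs as well")
    else:
        print("not Linux; this check does not apply")
```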

The edge cases around writeback failure are a tricky subject.  If the
reason we are running crash recovery is that we experienced an I/O
error and PANIC'd before, then it's possible for
recovery_init_sync_method=fsync to succeed while there is still
phantom data in the page cache masquerading as "clean" (i.e. it will
never be sent to the disk by Linux).  So at least in some cases, fsync
is no better than syncfs on older Linux for our purposes.

(I think the comment that Michael quoted assumes the default FreeBSD
caching model: cached data stays dirty until it's transferred to
disk or the file system is force-removed.  The Linux model is
different: cached data stays dirty until the kernel has attempted to
transfer it to disk just once, and then it'll report an error to user
space one time (or, in older versions, sometimes fewer), and it is
undefined (i.e. it depends on the file system) whether the affected
data is forgotten from cache, or still present as phantom data that is
bogusly considered clean.  This probably isn't as big a deal as it
sounds, because "transient" I/O failures are probably rare -- it's
more likely that a system with failing storage just completely
self-destructs and you never reach these fun edge cases.  But as
database hackers, we try to think about this stuff... perhaps one day
soon we'll be able to just go around this particular molehill with
direct I/O.)
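
To make the difference between the two caching models concrete, here is a
toy Python simulation.  It is only a sketch of the behaviour described
above, not of any real kernel: the class, its fields, and the
keep_dirty_on_error switch are all hypothetical, and it models only the
"phantom data stays in cache" outcome, not the case where the data is
forgotten.

```python
class ToyPageCache:
    """Toy page cache; illustrates the two models, not real kernel code."""

    def __init__(self, keep_dirty_on_error: bool):
        # True  -> pages stay dirty after a failed write (FreeBSD-like).
        # False -> pages are marked clean after one attempt (Linux-like).
        self.keep_dirty_on_error = keep_dirty_on_error
        self.device_ok = True   # whether writes to the "disk" succeed
        self.cache = {}         # page number -> cached data
        self.dirty = set()      # page numbers not yet written back
        self.disk = {}          # page number -> data actually on disk

    def write(self, page: int, data: str) -> None:
        self.cache[page] = data
        self.dirty.add(page)

    def fsync(self) -> bool:
        """Try to write back dirty pages; return False if any write failed."""
        failed = False
        for page in sorted(self.dirty):
            if self.device_ok:
                self.disk[page] = self.cache[page]
                self.dirty.discard(page)
            else:
                failed = True
                if not self.keep_dirty_on_error:
                    # Phantom data: still cached, considered clean,
                    # but never written to disk.
                    self.dirty.discard(page)
        return not failed


def crash_then_recover(keep_dirty_on_error: bool):
    """Return (first fsync ok, recovery fsync ok, data on disk)."""
    pc = ToyPageCache(keep_dirty_on_error)
    pc.device_ok = False
    pc.write(1, "new row")
    first = pc.fsync()    # fails; the database would PANIC here
    second = pc.fsync()   # fsync issued again during crash recovery
    return first, second, 1 in pc.disk


if __name__ == "__main__":
    print(crash_then_recover(keep_dirty_on_error=False))  # (False, True, False)
    print(crash_then_recover(keep_dirty_on_error=True))   # (False, False, False)
```

In the Linux-like run the second fsync reports success even though the
page never reached the disk, which is the phantom-data case; in the
FreeBSD-like run the failure keeps being reported while the page stays
dirty.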


