Re: should crash recovery ignore checkpoint_flush_after ? - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: should crash recovery ignore checkpoint_flush_after ?
Date
Msg-id 20200119161357.GR26045@telsasoft.com
In response to Re: should crash recovery ignore checkpoint_flush_after ?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Jan 18, 2020 at 03:32:02PM -0800, Andres Freund wrote:
> On 2020-01-19 09:52:21 +1300, Thomas Munro wrote:
> > On Sun, Jan 19, 2020 at 3:08 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > Does sync_file_range() even do anything for non-mmap'd files on ZFS?
> 
> Good point. Next time it might be worthwhile to use strace -T to see
> whether the sync_file_range calls actually take meaningful time.

> Yea, it requires the pages to be in the pagecache to do anything:

>     if (!mapping_cap_writeback_dirty(mapping) ||
>         !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>         return 0;

That logic is actually brand new (Sep 23, 2019, Linux 5.4):

https://github.com/torvalds/linux/commit/c3aab9a0bd91b696a852169479b7db1ece6cbf8c#diff-fd2d793b8b4760b4887c8c7bbb3451d7

Running a manual CHECKPOINT under strace -T, I saw calls like:

sync_file_range(0x15f, 0x1442c000, 0x2000, 0x2) = 0 <2.953956>
sync_file_range(0x15f, 0x14430000, 0x4000, 0x2) = 0 <0.006395>
sync_file_range(0x15f, 0x14436000, 0x4000, 0x2) = 0 <0.003859>
sync_file_range(0x15f, 0x1443e000, 0x2000, 0x2) = 0 <0.027975>
sync_file_range(0x15f, 0x14442000, 0x2000, 0x2) = 0 <0.000048>
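
To pick out slow calls like the first one above from a longer trace, something along these lines might help. It is only a rough sketch: the function names (parse_line, slow_calls, total_time) and the 0.1 s threshold are made up for illustration, and it assumes strace prints every argument numerically, as in the output above. The flag constants are the values from the Linux sync_file_range interface, so 0x2 is SYNC_FILE_RANGE_WRITE.

```python
import re

# Flag bits of the Linux sync_file_range() interface.
SYNC_FILE_RANGE_WAIT_BEFORE = 0x1
SYNC_FILE_RANGE_WRITE = 0x2
SYNC_FILE_RANGE_WAIT_AFTER = 0x4

# One numeric argument, hex or decimal (assumption: strace did not
# decode the arguments symbolically).
_NUM = r"(0x[0-9a-fA-F]+|\d+)"

# Matches lines such as:
#   sync_file_range(0x15f, 0x1442c000, 0x2000, 0x2) = 0 <2.953956>
LINE_RE = re.compile(
    rf"sync_file_range\({_NUM},\s*{_NUM},\s*{_NUM},\s*{_NUM}\)"
    r"\s*=\s*(-?\d+).*<(\d+\.\d+)>"
)


def parse_line(line):
    """Return a dict describing one sync_file_range call, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    # int(text, 0) accepts both "0x15f" and "351".
    fd, offset, nbytes, flags = (int(m.group(i), 0) for i in range(1, 5))
    return {
        "fd": fd,
        "offset": offset,
        "nbytes": nbytes,
        "flags": flags,
        "ret": int(m.group(5)),
        "seconds": float(m.group(6)),
    }


def slow_calls(lines, threshold=0.1):
    """Return the parsed calls that took longer than threshold seconds."""
    calls = (parse_line(line) for line in lines)
    return [c for c in calls if c is not None and c["seconds"] > threshold]


def total_time(lines):
    """Sum the elapsed time of all sync_file_range calls in lines."""
    calls = (parse_line(line) for line in lines)
    return sum(c["seconds"] for c in calls if c is not None)


if __name__ == "__main__":
    import sys

    trace = sys.stdin.readlines()
    for call in slow_calls(trace):
        print(f"fd={call['fd']} offset={call['offset']:#x} "
              f"nbytes={call['nbytes']:#x} took {call['seconds']:.6f}s")
    print(f"total: {total_time(trace):.6f}s")
```

Saved under a hypothetical name such as slow_sfr.py, it would read the strace output on stdin.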

And actually, that server had been running its DB instance on a CentOS 6 VM
(kernel-2.6.32-754.23.1.el6.x86_64), shared with the appserver, to mitigate
another issue last year.  I moved the DB back to its own CentOS 7 VM
(kernel-3.10.0-862.14.4.el7.x86_64), and I no longer see those slow calls.
So if there is an issue (with postgres or otherwise), it seems to be vastly
mitigated, or much harder to hit, under newer kernels.

I also found these:
https://github.com/torvalds/linux/commit/23d0127096cb91cb6d354bdc71bd88a7bae3a1d5 (master v5.5-rc6...v4.4-rc1)
https://github.com/torvalds/linux/commit/ee53a891f47444c53318b98dac947ede963db400 (master v5.5-rc6...v2.6.29-rc1)

The second commit may be the cause of the issue.

Going by the tags above, the first commit is too new to explain the difference
between the two kernels, but I'm guessing Red Hat backported it into their 3.10
kernel.

Thanks,
Justin


