Re: should crash recovery ignore checkpoint_flush_after ? - Mailing list pgsql-hackers

From	Justin Pryzby
Subject	Re: should crash recovery ignore checkpoint_flush_after ?
Date	January 19, 2020 16:13:57
Msg-id	20200119161357.GR26045@telsasoft.com
In response to	Re: should crash recovery ignore checkpoint_flush_after ? (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

On Sat, Jan 18, 2020 at 03:32:02PM -0800, Andres Freund wrote:
> On 2020-01-19 09:52:21 +1300, Thomas Munro wrote:
> > On Sun, Jan 19, 2020 at 3:08 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > Does sync_file_range() even do anything for non-mmap'd files on ZFS?
> 
> Good point. Next time it might be worthwhile to use strace -T to see
> whether the sync_file_range calls actually take meaningful time.

> Yea, it requires the pages to be in the pagecache to do anything:

>     if (!mapping_cap_writeback_dirty(mapping) ||
>         !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>         return 0;

That check is actually brand new (committed Sep 23, 2019, for Linux 5.4):

https://github.com/torvalds/linux/commit/c3aab9a0bd91b696a852169479b7db1ece6cbf8c#diff-fd2d793b8b4760b4887c8c7bbb3451d7

While running a manual CHECKPOINT, strace -T showed calls like:

sync_file_range(0x15f, 0x1442c000, 0x2000, 0x2) = 0 <2.953956>
sync_file_range(0x15f, 0x14430000, 0x4000, 0x2) = 0 <0.006395>
sync_file_range(0x15f, 0x14436000, 0x4000, 0x2) = 0 <0.003859>
sync_file_range(0x15f, 0x1443e000, 0x2000, 0x2) = 0 <0.027975>
sync_file_range(0x15f, 0x14442000, 0x2000, 0x2) = 0 <0.000048>
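To make timings like these easier to compare across runs, here is a minimal
sketch of a parser for strace -T lines. It assumes the arguments are printed
as plain hex or decimal numbers, as in the output above; the names parse_line
and decode_flags are made up for illustration and are not part of strace or
postgres. The flag values are the ones from the Linux sync_file_range(2)
header, so 0x2 decodes to SYNC_FILE_RANGE_WRITE.

```python
import re

# Flag bits as defined for sync_file_range(2) on Linux.
SYNC_FILE_RANGE_FLAGS = {
    0x1: "SYNC_FILE_RANGE_WAIT_BEFORE",
    0x2: "SYNC_FILE_RANGE_WRITE",
    0x4: "SYNC_FILE_RANGE_WAIT_AFTER",
}

NUM = r"(?:0x[0-9a-fA-F]+|\d+)"

# Matches e.g.: sync_file_range(0x15f, 0x1442c000, 0x2000, 0x2) = 0 <2.953956>
# Assumption: numeric arguments only (no symbolic flag names).
LINE_RE = re.compile(
    r"sync_file_range\((?P<fd>" + NUM + r"),\s*(?P<offset>" + NUM + r"),\s*"
    r"(?P<nbytes>" + NUM + r"),\s*(?P<flags>" + NUM + r")\)"
    r"\s*=\s*(?P<ret>-?\d+).*?<(?P<seconds>\d+\.\d+)>"
)


def decode_flags(value):
    """Return the names of the flag bits set in value."""
    return [name for bit, name in SYNC_FILE_RANGE_FLAGS.items() if value & bit]


def parse_line(line):
    """Parse one strace -T line; return None if it is not a matching call."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "fd": int(m.group("fd"), 0),
        "offset": int(m.group("offset"), 0),
        "nbytes": int(m.group("nbytes"), 0),
        "flags": decode_flags(int(m.group("flags"), 0)),
        "ret": int(m.group("ret")),
        "seconds": float(m.group("seconds")),
    }


# The five calls quoted in this message.
SAMPLE = """\
sync_file_range(0x15f, 0x1442c000, 0x2000, 0x2) = 0 <2.953956>
sync_file_range(0x15f, 0x14430000, 0x4000, 0x2) = 0 <0.006395>
sync_file_range(0x15f, 0x14436000, 0x4000, 0x2) = 0 <0.003859>
sync_file_range(0x15f, 0x1443e000, 0x2000, 0x2) = 0 <0.027975>
sync_file_range(0x15f, 0x14442000, 0x2000, 0x2) = 0 <0.000048>
"""

if __name__ == "__main__":
    calls = [c for c in map(parse_line, SAMPLE.splitlines()) if c]
    slowest = max(calls, key=lambda c: c["seconds"])
    total = sum(c["seconds"] for c in calls)
    print("calls: %d, total: %.6f s" % (len(calls), total))
    print("slowest: offset=%#x nbytes=%d flags=%s took %.6f s" % (
        slowest["offset"], slowest["nbytes"],
        "|".join(slowest["flags"]), slowest["seconds"]))
```

On the sample above, nearly all of the time is in the first call, which
covers only 0x2000 (8192) bytes.
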

Actually, that server had been running its DB instance on a centos6 VM
(kernel-2.6.32-754.23.1.el6.x86_64), shared with the appserver, to mitigate
another issue last year.  I moved the DB back to its own centos7 VM
(kernel-3.10.0-862.14.4.el7.x86_64), and I can no longer reproduce the slow
calls there.  So if there is an issue (in postgres or elsewhere), it seems to
be greatly mitigated, or much harder to hit, under newer kernels.

I also found these:
https://github.com/torvalds/linux/commit/23d0127096cb91cb6d354bdc71bd88a7bae3a1d5 (first tagged in v4.4-rc1)
https://github.com/torvalds/linux/commit/ee53a891f47444c53318b98dac947ede963db400 (first tagged in v2.6.29-rc1)

The second commit may be the cause of the issue.

The first commit is nominally too new to explain the difference between the
two kernels, but I'm guessing Red Hat backported it into their 3.10 kernel.

Thanks,
Justin
