Re: Problem while setting the fpw with SIGHUP - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Problem while setting the fpw with SIGHUP
Date
Msg-id CAFiTN-vT5m4=CLuZ6pGtNE=_+2rN5nzE6Cd9g9GWv1rLsnyPcQ@mail.gmail.com
In response to Re: Problem while setting the fpw with SIGHUP  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Problem while setting the fpw with SIGHUP  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Fri, Mar 16, 2018 at 10:53 AM, Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Mar 13, 2018 at 05:04:04PM +0900, Michael Paquier wrote:
>> Instead of doing what you are suggesting, why not moving
>> InitXLogInsert() out of InitXLOGAccess() and change InitPostgres() so as
>> the allocations for WAL inserts is done unconditionally?  This has
>> the cost of also making this allocation even for backends which are
>> started during recovery, still we are talking about allocating a couple
>> of bytes in exchange of addressing completely all race conditions in
>> this area.  InitXLogInsert() does not depend on any post-recovery data
>> like ThisTimeLineId, so a split is possible.
>
> I have been hacking things this way, and it seems to me that it takes
> care of all this class of problems.  CreateCheckPoint() actually
> mentions that InitXLogInsert() cannot be called in a critical section,
> so the comments don't match the code.  I also think that we still want
> to be able to use RecoveryInProgress() in critical sections to do
> decision-making for the generation of WAL records.

Thanks for the patch, the idea looks good to me.  Please find some comments and an updated patch.

I think that, as for WALWriterProcess, we need to call InitXLogInsert() for CheckpointerProcess as well as for BgWriterProcess,
because earlier they were calling InitXLogInsert() while checking RecoveryInProgress() before inserting WAL.

See the crash below:
#0  0x00007f89273a65d7 in raise () from /lib64/libc.so.6
#1  0x00007f89273a7cc8 in abort () from /lib64/libc.so.6
#2  0x00000000009fd24e in errfinish (dummy=0) at elog.c:556
#3  0x00000000009ff70b in elog_finish (elevel=20, fmt=0xac0d1d "too much WAL data") at elog.c:1378
#4  0x0000000000558766 in XLogRegisterData (data=0xf3efac <fullPageWrites> "\001", len=1) at xloginsert.c:330
#5  0x000000000055080e in UpdateFullPageWrites () at xlog.c:9569
#6  0x00000000007ea831 in UpdateSharedMemoryConfig () at checkpointer.c:1360
#7  0x00000000007e95b1 in CheckpointerMain () at checkpointer.c:370
#8  0x0000000000561680 in AuxiliaryProcessMain (argc=2, argv=0x7fffcfd4bec0) at bootstrap.c:447

I have modified your patch to call InitXLogInsert() for CheckpointerProcess and BgWriterProcess as well. With that change the
issue is resolved and fpw is set properly.
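
To illustrate the failure and the fix, here is a minimal, self-contained toy model. It is not the patch itself and not PostgreSQL code. It assumes the "too much WAL data" error in the backtrace comes from the WAL-insert working area never having been allocated in the checkpointer, so that the registration limit is still zero. Every name prefixed with toy_ or TOY_ is hypothetical and only stands in for the real function or process type named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical capacity of the working area; the real value may differ. */
#define TOY_NORMAL_RDATAS 20

typedef struct ToyRecData
{
    const char *data;
    size_t      len;
} ToyRecData;

/* Per-process working area; empty until toy_init_xlog_insert() runs. */
static ToyRecData *toy_rdatas = NULL;
static int         toy_max_rdatas = 0;
static int         toy_num_rdatas = 0;

/* Stand-in for InitXLogInsert(): allocate the working area once. */
bool
toy_init_xlog_insert(void)
{
    if (toy_rdatas != NULL)
        return true;            /* already initialized, nothing to do */

    toy_rdatas = calloc(TOY_NORMAL_RDATAS, sizeof(ToyRecData));
    if (toy_rdatas == NULL)
        return false;

    toy_max_rdatas = TOY_NORMAL_RDATAS;
    return true;
}

/*
 * Stand-in for XLogRegisterData(): returns false at the point where the
 * real function is assumed to raise "too much WAL data".
 */
bool
toy_register_data(const char *data, size_t len)
{
    assert(data != NULL);

    if (toy_num_rdatas >= toy_max_rdatas)
        return false;           /* no room, or area never allocated */

    toy_rdatas[toy_num_rdatas].data = data;
    toy_rdatas[toy_num_rdatas].len = len;
    toy_num_rdatas++;
    return true;
}

typedef enum
{
    TOY_WAL_WRITER,
    TOY_CHECKPOINTER,
    TOY_BG_WRITER,
    TOY_STARTUP
} ToyProcType;

/*
 * Stand-in for auxiliary process startup: every process type that may
 * insert WAL initializes the working area up front, outside any
 * critical section.
 */
bool
toy_aux_process_start(ToyProcType type)
{
    switch (type)
    {
        case TOY_WAL_WRITER:
        case TOY_CHECKPOINTER:
        case TOY_BG_WRITER:
            return toy_init_xlog_insert();
        default:
            return true;        /* this process type does not insert WAL */
    }
}
```

In this model, calling toy_register_data() before toy_aux_process_start(TOY_CHECKPOINTER) fails even for a single byte, which mirrors the len=1 registration in frame #4 above. After the startup call the same registration succeeds.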

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
