Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master - Mailing list pgsql-bugs

From Fujii Masao
Subject Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master
Date
Msg-id CAHGQGwHoHNmZGZ4UXta8UtxXUu0-BpLk-qNVyiNs0ONX2mnB+Q@mail.gmail.com
Whole thread Raw
In response to Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-bugs
On Fri, Jul 3, 2015 at 9:26 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Jul 3, 2015 at 4:03 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Thu, Jul 2, 2015 at 10:00 PM, Fujii Masao wrote:
>>>> At the beginning of
>>>> StartupXLOG() before parsing recovery.conf we could check for the
>>>> existence of promotion trigger files and unlink them unconditionally.
>>>
>>> There seems to be still race condition: postmaster can receive SIGUSR1
>>> before the startup process removes the promotion trigger file. Then
>>> the variable promote_triggered can be set to true unexpectedly.
>>
>> Ah, yes. That's indeed possible.
>>
>>> So, what about making postmaster remove the trigger file unconditionally
>>> before the startup process is forked, instead? For example, just after
>>> PostmasterMain() calls RemovePgTempFiles()?
>>
>> Sounds fine, no other processes are running at this point.
>
> Thanks for updating the patch!
>
> +                * Skip promote signal files. Those files may have
> been created by
> +                * pg_ctl to trigger a promotion but they do not
> belong to a base
> +                * backup.
> +                */
> +               if (strcmp(de->d_name, PROMOTE_SIGNAL_FILE) == 0 ||
> +                       strcmp(de->d_name, FALLBACK_PROMOTE_SIGNAL_FILE) == 0)
> +                       continue;
>
> Do we really want this? This is not required because the files are
> always removed at the server startup. Also the backup performance
> gain by skipping them would be almost zero since their size is zero.
> That is, no need to add this for performance. OTOH, it's harmless
> to add this. But I don't want to enlarge the "skip file list" for
> pg_basebackup beyond necessary. So I removed this from the patch.
>
> BTW, if we really want this, not only pg_basebackup but also pg_rewind
> would need to skip the files.
>
> +        * Remove any trigger file able to trigger promotion, this is safe from
> +        * race conditions involving signals as no other processes are running
> +        * yet.
> +        */
> +       RemoveStandbyTrigger();
>
> I changed the function name to RemovePromoteSignalFiles().
>
> I added more detail comment about the race condition that we're
> talking about, into here.
>
> Updated version of the patch attached.

Pushed.

Regards,

--
Fujii Masao

pgsql-bugs by date:

Previous
From: Fujii Masao
Date:
Subject: Re: GRANT USAGE ON SEQUENCE missing from psql command completion
Next
From: hikkis21c@naver.com
Date:
Subject: BUG #13610: upgrade fail