On 2021/05/18 14:53, Dilip Kumar wrote:
> On Mon, May 17, 2021 at 7:59 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>> If a promotion is triggered while recovery is paused, the paused state ends
>> and promotion continues. But currently pg_get_wal_replay_pause_state()
>> returns 'paused' in that case. Isn't this a bug?
>>
>> Attached patch fixes this issue by resetting the recovery pause state to
>> 'not paused' when standby promotion is triggered.
>>
>> Thought?
>>
>
> I think that prior to commit 496ee647ecd2917369ffcf1eaa0b2cdca07c8730
> (Prefer standby promotion over recovery pause.) this behavior was fine
> because the pause was continued, but after this commit we are now
> giving preference to promotion, so this is a bug and needs to be fixed.
>
> The fix looks fine, but I think along with this we should also return
> immediately from the pause loop if promotion is requested. If we
> recheck the recovery pause state, someone can pause again and we will
> stay in the loop, so it is better to exit as soon as promotion is
> requested; see the attached patch. It should be applied along with
> your patch.

But this change can cause recovery to continue with insufficient parameter
settings if a promotion is requested while the server is paused because of
such invalid settings. This behavior does not seem safe.

If my understanding is right, recovery should abort immediately
(i.e., the FATAL error "recovery aborted because of insufficient parameter
settings" should be thrown) if a promotion is requested in that case, just as
it does when pg_wal_replay_resume() is executed in that case. Thoughts?
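
To make the cases concrete, here is a minimal sketch in C of the decision
I have in mind. It is not the actual recovery code; the names PauseReason,
PauseAction and decide_pause_action() are hypothetical and only model the
behavior discussed above, on the assumption that the pause loop can tell
a user-requested pause from a pause caused by insufficient parameter
settings.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of why recovery is paused (not PostgreSQL's types). */
typedef enum
{
	PAUSE_NONE,					/* recovery is not paused */
	PAUSE_USER_REQUESTED,		/* paused on user request */
	PAUSE_INSUFFICIENT_PARAMS	/* paused because of invalid settings */
} PauseReason;

/* What the pause loop should do next. */
typedef enum
{
	ACTION_KEEP_WAITING,		/* stay in the pause loop */
	ACTION_CONTINUE_RECOVERY,	/* leave the loop and carry on */
	ACTION_ABORT_FATAL			/* "recovery aborted because of
								 * insufficient parameter settings" */
} PauseAction;

/*
 * Decide how to leave (or stay in) the pause loop.
 *
 * A promotion request is treated like a resume request: both end the
 * pause, but neither may let recovery continue with invalid settings.
 */
PauseAction
decide_pause_action(PauseReason reason,
					bool promote_requested,
					bool resume_requested)
{
	assert(reason >= PAUSE_NONE && reason <= PAUSE_INSUFFICIENT_PARAMS);

	if (reason == PAUSE_NONE)
		return ACTION_CONTINUE_RECOVERY;

	if (!promote_requested && !resume_requested)
		return ACTION_KEEP_WAITING;

	if (reason == PAUSE_INSUFFICIENT_PARAMS)
		return ACTION_ABORT_FATAL;

	return ACTION_CONTINUE_RECOVERY;
}
```

With this model, decide_pause_action(PAUSE_INSUFFICIENT_PARAMS, true, false)
yields ACTION_ABORT_FATAL, whereas an unconditional return from the loop on
promotion would behave like ACTION_CONTINUE_RECOVERY in that case.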

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION