Thread: Changes to recovery_min_apply_delay are ignored while waiting for delay
Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Soumyadeep Chakraborty
Date:
Hello, I came across this issue while I was tweaking a TAP test with Ashwin for this thread [1]. We noticed that while the startup process waits for a recovery delay, it does not respect changes to the recovery_min_apply_delay GUC. So: // Standby is waiting on recoveryWakeupLatch with a large recovery_min_apply_delay value node_standby->safe_psql('postgres', 'ALTER SYSTEM SET recovery_min_apply_delay TO 0;'); node_standby->reload; // Standby is still waiting, it does not proceed until the old timeout is elapsed. I attached a small patch to fix this. It makes it so that recovery_min_apply_delay is reconsulted in the loop to redetermine the wait interval. This makes it similar to the loop in CheckpointerMain, where CheckPointTimeout is consulted after interrupts are handled: if (elapsed_secs >= CheckPointTimeout) continue; /* no sleep for us ... */ I have also added a test to 005_replay_delay.pl. Regards, Soumyadeep(VMware) [1] https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com
Attachment
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Kyotaro Horiguchi
Date:
At Mon, 2 Aug 2021 22:21:56 -0700, Soumyadeep Chakraborty <soumyadeep2007@gmail.com> wrote in > Hello, > > I came across this issue while I was tweaking a TAP test with Ashwin for > this thread [1]. > > We noticed that while the startup process waits for a recovery delay, it does > not respect changes to the recovery_min_apply_delay GUC. So: > > // Standby is waiting on recoveryWakeupLatch with a large > recovery_min_apply_delay value > node_standby->safe_psql('postgres', 'ALTER SYSTEM SET > recovery_min_apply_delay TO 0;'); > node_standby->reload; > // Standby is still waiting, it does not proceed until the old timeout > is elapsed. > > I attached a small patch to fix this. It makes it so that > recovery_min_apply_delay is reconsulted in the loop to redetermine the wait > interval. This makes it similar to the loop in CheckpointerMain, where > CheckPointTimeout is consulted after interrupts are handled: > > if (elapsed_secs >= CheckPointTimeout) > continue; /* no sleep for us ... */ > > I have also added a test to 005_replay_delay.pl. > > Regards, > Soumyadeep(VMware) > > [1] https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com Sounds reasonable and the patch looks fine as a whole. Applied cleanly and works as expected. The added test properly catches the issue. One comment from me. +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO 0;"); It might be better do "SET reco.. TO DEFAULT" instead. And how about adding the new test at the end of existing tests. We can get rid of the extra lines in the diff. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Soumyadeep Chakraborty
Date:
Hi Kyotaro, Thanks for the review! On Mon, Aug 2, 2021 at 11:42 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > One comment from me. > > +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO 0;"); > > It might be better do "SET reco.. TO DEFAULT" instead. > Sure. > And how about adding the new test at the end of existing tests. We can > get rid of the extra lines in the diff. No problem. See attached v2. I didn't do that initially as the test file looks as though it is split into two halves: one for testing recovery_min_apply_delay and the other for testing recovery pause. So while making this change, I added a header comment for the newly added test case. Please take a look. Regards, Soumyadeep (VMware)
Attachment
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Soumyadeep Chakraborty
Date:
Rebased. Also added a stronger check to see if the standby is stuck in recovery_min_apply_delay: $node_standby->poll_query_until('postgres', qq{ SELECT wait_event = 'RecoveryApplyDelay' FROM pg_stat_activity WHERE backend_type='startup'; }) or die "Timed out checking if startup is in recovery_min_apply_delay"; Attached v3. Regards, Soumyadeep (VMware)
Attachment
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Michael Paquier
Date:
On Fri, Aug 06, 2021 at 04:59:55PM -0700, Soumyadeep Chakraborty wrote: > Rebased. Also added a stronger check to see if the standby is stuck in > recovery_min_apply_delay: > > $node_standby->poll_query_until('postgres', qq{ > SELECT wait_event = 'RecoveryApplyDelay' FROM pg_stat_activity > WHERE backend_type='startup'; > }) or die "Timed out checking if startup is in recovery_min_apply_delay"; Relying on wait events is a good idea here. This makes the test more reliable to make sure that the startup process is stuck in the WaitLatch phase. > Attached v3. Agreed that the current behavior is confusing. As you are using the commit timestamp for the comparison, this is right. One small-ish comment I have about the code is that we should mention recovery_min_apply_delay for HandleStartupProcInterrupts(), and not only the trigger file location. +# First, set the delay for the next commit to some obscenely high value. It has no need to be obscenely high, just high enough to give the time to slow machines to catch the wait event lookup done. So this could use better words to explain this choice. Anyway, it seems to me that something is incorrect in this new test (manual tests pass here, the TAP test is off). One thing we need to make sure for any new test added is that it correctly breaks if the fix proposed is *not* in place. And as far as I can see, the test passes even if the recalculation of delayUntil is done within the loop that reloads the configuration. The thing may be a bit tricky to make reliable as the parameter reloading may cause wait_event to not be RecoveryApplyDelay, so we should have at least a check based on a scan of pg_stat_replication.replay_lsn on the primary. Perhaps you have an other idea? When using wait_for_catchup(), I would recommend to list "replay" for this test, even if that's the default mode, to make clear what the intention is. -- Michael
Attachment
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Soumyadeep Chakraborty
Date:
Hey Michael, Really appreciate the review! On Wed, Aug 11, 2021 at 12:40 AM Michael Paquier <michael@paquier.xyz> wrote: > Agreed that the current behavior is confusing. As you are using the > commit timestamp for the comparison, this is right. One small-ish > comment I have about the code is that we should mention > recovery_min_apply_delay for HandleStartupProcInterrupts(), and not > only the trigger file location. Cool, updated. > +# First, set the delay for the next commit to some obscenely high value. > It has no need to be obscenely high, just high enough to give the time > to slow machines to catch the wait event lookup done. So this could > use better words to explain this choice. Sounds good. Done. > Anyway, it seems to me that something is incorrect in this new test > (manual tests pass here, the TAP test is off). One thing we need to > make sure for any new test added is that it correctly breaks if the > fix proposed is *not* in place. And as far as I can see, the test > passes even if the recalculation of delayUntil is done within the loop > that reloads the configuration. The thing may be a bit tricky to make > reliable as the parameter reloading may cause wait_event to not be > RecoveryApplyDelay, so we should have at least a check based on a scan > of pg_stat_replication.replay_lsn on the primary. Perhaps you have > an other idea? Hmm, please see attached v4 which just shifts the test to the middle, like v1. When I run the test without the code change, the test hangs as expected in: # Now the standby should catch up. $node_primary->wait_for_catchup('standby', 'replay'); and passes with the code change, as expected. I can't explain why the test doesn't freeze up in v3 in wait_for_catchup() at the end. As for checking on the wait event, since we only signal the standby after performing the check, that should be fine. Nothing else would be sending a SIGHUP before the check. Is that assumption correct? > When using wait_for_catchup(), I would recommend to list "replay" for > this test, even if that's the default mode, to make clear what the > intention is. Done. Regards, Soumyadeep (VMware)
Attachment
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Michael Paquier
Date:
On Fri, Aug 13, 2021 at 05:59:21PM -0700, Soumyadeep Chakraborty wrote: > and passes with the code change, as expected. I can't explain why the > test doesn't freeze up in v3 in wait_for_catchup() at the end. It took me some some to understand why. If I am right, that's because of the intermediate test block working on $standby_2 and the two INSERT queries of the primary. In v1 and v4, we have no activity on the primary between the first set of tests and yours, meaning that $standby has nothing to do. In v2 and v3, the two INSERT queries run on the primary for the purpose of the recovery pause make $standby_1 wait for the default value of recovery_min_apply_delay, aka 3s, in parallel. If the set of tests for $standby_2 is faster than that, we'd bump on the phase where the code still waited for 3s, not the 2 hours set, visibly. After considering this stuff, the order dependency we'd introduce in this test makes the whole thing more brittle than it should. And such an edge case does not seem worth spending extra cycles testing anyway, as if things break we'd finish with a test stuck for an unnecessary long time by relying on wait_for_catchup("replay"). We could use something else, say based on a lookup of pg_stat_activity but this still requires extra run time for the wait phases needed. So at the end I have dropped the test, but backpatched the fix. -- Michael
Attachment
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
From
Soumyadeep Chakraborty
Date:
On Sun, Aug 15, 2021 at 8:16 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Aug 13, 2021 at 05:59:21PM -0700, Soumyadeep Chakraborty wrote: > > and passes with the code change, as expected. I can't explain why the > > test doesn't freeze up in v3 in wait_for_catchup() at the end. > > It took me some some to understand why. If I am right, that's because > of the intermediate test block working on $standby_2 and the two > INSERT queries of the primary. In v1 and v4, we have no activity on > the primary between the first set of tests and yours, meaning that > $standby has nothing to do. In v2 and v3, the two INSERT queries run > on the primary for the purpose of the recovery pause make $standby_1 > wait for the default value of recovery_min_apply_delay, aka 3s, in > parallel. If the set of tests for $standby_2 is faster than that, > we'd bump on the phase where the code still waited for 3s, not the 2 > hours set, visibly. I see, thanks a lot for the explanation. Thanks to your investigation, I can now kind of reuse some of the test mechanisms for the other patch that I am working on [1]. There, we don't have multiple standbys getting in the way, thankfully. > After considering this stuff, the order dependency we'd introduce in > this test makes the whole thing more brittle than it should. And such > an edge case does not seem worth spending extra cycles testing anyway, > as if things break we'd finish with a test stuck for an unnecessary > long time by relying on wait_for_catchup("replay"). We could use > something else, say based on a lookup of pg_stat_activity but this > still requires extra run time for the wait phases needed. So at the > end I have dropped the test, but backpatched the fix. > -- Fair. Regards, Soumyadeep (VMware) [1] https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com