Thread: Changes to recovery_min_apply_delay are ignored while waiting for delay

Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Soumyadeep Chakraborty
Date:
Hello,

I came across this issue while I was tweaking a TAP test with Ashwin for
this thread [1].

We noticed that while the startup process waits for a recovery delay, it does
not respect changes to the recovery_min_apply_delay GUC. So:

// Standby is waiting on recoveryWakeupLatch with a large
recovery_min_apply_delay value
node_standby->safe_psql('postgres', 'ALTER SYSTEM SET
recovery_min_apply_delay TO 0;');
node_standby->reload;
// Standby is still waiting, it does not proceed until the old timeout
is elapsed.

I attached a small patch to fix this. It makes it so that
recovery_min_apply_delay is reconsulted in the loop to redetermine the wait
interval. This makes it similar to the loop in CheckpointerMain, where
CheckPointTimeout is consulted after interrupts are handled:

if (elapsed_secs >= CheckPointTimeout)
continue; /* no sleep for us ... */

I have also added a test to 005_replay_delay.pl.

Regards,
Soumyadeep(VMware)

[1] https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com

Attachment

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Kyotaro Horiguchi
Date:
At Mon, 2 Aug 2021 22:21:56 -0700, Soumyadeep Chakraborty <soumyadeep2007@gmail.com> wrote in 
> Hello,
> 
> I came across this issue while I was tweaking a TAP test with Ashwin for
> this thread [1].
> 
> We noticed that while the startup process waits for a recovery delay, it does
> not respect changes to the recovery_min_apply_delay GUC. So:
> 
> // Standby is waiting on recoveryWakeupLatch with a large
> recovery_min_apply_delay value
> node_standby->safe_psql('postgres', 'ALTER SYSTEM SET
> recovery_min_apply_delay TO 0;');
> node_standby->reload;
> // Standby is still waiting, it does not proceed until the old timeout
> is elapsed.
> 
> I attached a small patch to fix this. It makes it so that
> recovery_min_apply_delay is reconsulted in the loop to redetermine the wait
> interval. This makes it similar to the loop in CheckpointerMain, where
> CheckPointTimeout is consulted after interrupts are handled:
> 
> if (elapsed_secs >= CheckPointTimeout)
> continue; /* no sleep for us ... */
> 
> I have also added a test to 005_replay_delay.pl.
> 
> Regards,
> Soumyadeep(VMware)
> 
> [1] https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com

Sounds reasonable and the patch looks fine as a whole. Applied cleanly
and works as expected.  The added test properly catches the issue.

One comment from me.

+$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO 0;");

It might be better do "SET reco.. TO DEFAULT" instead.

And how about adding the new test at the end of existing tests. We can
get rid of the extra lines in the diff.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Soumyadeep Chakraborty
Date:
Hi Kyotaro,

Thanks for the review!

On Mon, Aug 2, 2021 at 11:42 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

> One comment from me.
>
> +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO 0;");
>
> It might be better do "SET reco.. TO DEFAULT" instead.
>

Sure.

> And how about adding the new test at the end of existing tests. We can
> get rid of the extra lines in the diff.

No problem. See attached v2. I didn't do that initially as the test file
looks as though it is split into two halves: one for testing
recovery_min_apply_delay and the other for testing recovery pause. So while
making this change, I added a header comment for the newly added test case.
Please take a look.

Regards,
Soumyadeep (VMware)

Attachment

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Soumyadeep Chakraborty
Date:
Rebased. Also added a stronger check to see if the standby is stuck in
recovery_min_apply_delay:

$node_standby->poll_query_until('postgres', qq{
SELECT wait_event = 'RecoveryApplyDelay' FROM pg_stat_activity
WHERE backend_type='startup';
}) or die "Timed out checking if startup is in recovery_min_apply_delay";

Attached v3.

Regards,
Soumyadeep (VMware)

Attachment

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Michael Paquier
Date:
On Fri, Aug 06, 2021 at 04:59:55PM -0700, Soumyadeep Chakraborty wrote:
> Rebased. Also added a stronger check to see if the standby is stuck in
> recovery_min_apply_delay:
>
> $node_standby->poll_query_until('postgres', qq{
> SELECT wait_event = 'RecoveryApplyDelay' FROM pg_stat_activity
> WHERE backend_type='startup';
> }) or die "Timed out checking if startup is in recovery_min_apply_delay";

Relying on wait events is a good idea here.  This makes the test more
reliable to make sure that the startup process is stuck in the
WaitLatch phase.

> Attached v3.

Agreed that the current behavior is confusing.  As you are using the
commit timestamp for the comparison, this is right.  One small-ish
comment I have about the code is that we should mention
recovery_min_apply_delay for HandleStartupProcInterrupts(), and not
only the trigger file location.

+# First, set the delay for the next commit to some obscenely high value.
It has no need to be obscenely high, just high enough to give the time
to slow machines to catch the wait event lookup done.  So this could
use better words to explain this choice.

Anyway, it seems to me that something is incorrect in this new test
(manual tests pass here, the TAP test is off).  One thing we need to
make sure for any new test added is that it correctly breaks if the
fix proposed is *not* in place.  And as far as I can see, the test
passes even if the recalculation of delayUntil is done within the loop
that reloads the configuration.  The thing may be a bit tricky to make
reliable as the parameter reloading may cause wait_event to not be
RecoveryApplyDelay, so we should have at least a check based on a scan
of pg_stat_replication.replay_lsn on the primary.  Perhaps you have
an other idea?

When using wait_for_catchup(), I would recommend to list "replay" for
this test, even if that's the default mode, to make clear what the
intention is.
--
Michael

Attachment

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Soumyadeep Chakraborty
Date:
Hey Michael,

Really appreciate the review!

On Wed, Aug 11, 2021 at 12:40 AM Michael Paquier <michael@paquier.xyz> wrote:

> Agreed that the current behavior is confusing.  As you are using the
> commit timestamp for the comparison, this is right.  One small-ish
> comment I have about the code is that we should mention
> recovery_min_apply_delay for HandleStartupProcInterrupts(), and not
> only the trigger file location.

Cool, updated.

> +# First, set the delay for the next commit to some obscenely high value.
> It has no need to be obscenely high, just high enough to give the time
> to slow machines to catch the wait event lookup done.  So this could
> use better words to explain this choice.

Sounds good. Done.

> Anyway, it seems to me that something is incorrect in this new test
> (manual tests pass here, the TAP test is off).  One thing we need to
> make sure for any new test added is that it correctly breaks if the
> fix proposed is *not* in place.  And as far as I can see, the test
> passes even if the recalculation of delayUntil is done within the loop
> that reloads the configuration.  The thing may be a bit tricky to make
> reliable as the parameter reloading may cause wait_event to not be
> RecoveryApplyDelay, so we should have at least a check based on a scan
> of pg_stat_replication.replay_lsn on the primary.  Perhaps you have
> an other idea?

Hmm, please see attached v4 which just shifts the test to the middle,
like v1. When I run the test without the code change, the test hangs
as expected in:

# Now the standby should catch up.
$node_primary->wait_for_catchup('standby', 'replay');

and passes with the code change, as expected. I can't explain why the
test doesn't freeze up in v3 in wait_for_catchup() at the end.

As for checking on the wait event, since we only signal the standby
after performing the check, that should be fine. Nothing else would be
sending a SIGHUP before the check. Is that assumption correct?

> When using wait_for_catchup(), I would recommend to list "replay" for
> this test, even if that's the default mode, to make clear what the
> intention is.

Done.

Regards,
Soumyadeep (VMware)

Attachment

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Michael Paquier
Date:
On Fri, Aug 13, 2021 at 05:59:21PM -0700, Soumyadeep Chakraborty wrote:
> and passes with the code change, as expected. I can't explain why the
> test doesn't freeze up in v3 in wait_for_catchup() at the end.

It took me some some to understand why.  If I am right, that's because
of the intermediate test block working on $standby_2 and the two
INSERT queries of the primary.  In v1 and v4, we have no activity on
the primary between the first set of tests and yours, meaning that
$standby has nothing to do.  In v2 and v3, the two INSERT queries run
on the primary for the purpose of the recovery pause make $standby_1
wait for the default value of recovery_min_apply_delay, aka 3s, in
parallel.  If the set of tests for $standby_2 is faster than that,
we'd bump on the phase where the code still waited for 3s, not the 2
hours set, visibly.

After considering this stuff, the order dependency we'd introduce in
this test makes the whole thing more brittle than it should.  And such
an edge case does not seem worth spending extra cycles testing anyway,
as if things break we'd finish with a test stuck for an unnecessary
long time by relying on wait_for_catchup("replay").  We could use
something else, say based on a lookup of pg_stat_activity but this
still requires extra run time for the wait phases needed.  So at the
end I have dropped the test, but backpatched the fix.
--
Michael

Attachment

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

From
Soumyadeep Chakraborty
Date:
On Sun, Aug 15, 2021 at 8:16 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Aug 13, 2021 at 05:59:21PM -0700, Soumyadeep Chakraborty wrote:
> > and passes with the code change, as expected. I can't explain why the
> > test doesn't freeze up in v3 in wait_for_catchup() at the end.
>
> It took me some some to understand why.  If I am right, that's because
> of the intermediate test block working on $standby_2 and the two
> INSERT queries of the primary.  In v1 and v4, we have no activity on
> the primary between the first set of tests and yours, meaning that
> $standby has nothing to do.  In v2 and v3, the two INSERT queries run
> on the primary for the purpose of the recovery pause make $standby_1
> wait for the default value of recovery_min_apply_delay, aka 3s, in
> parallel.  If the set of tests for $standby_2 is faster than that,
> we'd bump on the phase where the code still waited for 3s, not the 2
> hours set, visibly.

I see, thanks a lot for the explanation. Thanks to your investigation, I
can now kind of reuse some of the test mechanisms for the other patch that
I am working on [1]. There, we don't have multiple standbys getting in the
way, thankfully.

> After considering this stuff, the order dependency we'd introduce in
> this test makes the whole thing more brittle than it should.  And such
> an edge case does not seem worth spending extra cycles testing anyway,
> as if things break we'd finish with a test stuck for an unnecessary
> long time by relying on wait_for_catchup("replay").  We could use
> something else, say based on a lookup of pg_stat_activity but this
> still requires extra run time for the wait phases needed.  So at the
> end I have dropped the test, but backpatched the fix.
> --

Fair.

Regards,
Soumyadeep (VMware)

[1] https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com