On Saturday, March 2, 2024 6:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Mar 2, 2024 at 9:21 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> wrote:
> >
> > Apart from the comments, the code in WalSndWaitForWal was refactored a
> > bit to make it neater. Thanks Shveta for helping writing the code and doc.
> >
>
> A few more comments:
Thanks for the comments.
> ==================
> 1.
> +# Wait until the primary server logs a warning indicating that it is waiting
> +# for the sb1_slot to catch up.
> +$primary->wait_for_log(
> + qr/replication slot \"sb1_slot\" specified in parameter standby_slot_names does not have active_pid/,
> + $offset);
>
> Shouldn't we wait for such a LOG even in the first test as well which involves two
> standbys and two logical subscribers?
Yes, we should. Added.
>
> 2.
> +##################################################
> +# Test that logical replication will wait for the user-created inactive
> +# physical slot to catch up until we remove the slot from standby_slot_names.
> +##################################################
>
>
> I don't see anything different tested in this test from what we already tested in
> the first test involving two standbys and two logical subscribers. Can you
> please clarify if I am missing something?
I think the intention was to test that the wait loop ends on a GUC config
reload, whereas the first test covers the case where the loop ends because
restart_lsn advances. But it seems we already tested the config reload with
xx_get_changes() as well, so I can remove this test if you agree.
>
> 3.
> Note that after receiving the shutdown signal, an ERROR
> + * is reported if any slots are dropped, invalidated, or inactive. This
> + * measure is taken to prevent the walsender from waiting indefinitely.
> + */
> + if (NeedToWaitForStandby(target_lsn, flushed_lsn, wait_event))
>
> Isn't this part of the comment should be moved inside
> NeedToWaitForStandby()?
Moved.
>
> 4.
> + /*
> + * Update our idea of the currently flushed position only if we are
> + * not waiting for standbys to catch up, otherwise the standby would
> + * have to catch up to a newer WAL location in each cycle.
> + */
> + if (wait_event != WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION)
> + {
>
> This functionality (in function WalSndWaitForWal()) seems to ensure that we
> first wait for the required WAL to be flushed and then wait for standbys. If true,
> we should cover that point in the comments here or somewhere in the function
> WalSndWaitForWal().
>
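For reference, the ordering described in this comment can be sketched as a
small standalone C fragment. This is not the patch code: WaitState, WaitPhase
and wait_step() are hypothetical names used only for illustration, and the
assumption that standbys must confirm up to the cached flushed position is my
reading of the quoted comment, not something taken from the patch itself.

```c
#include <stdint.h>

typedef uint64_t Lsn;           /* stand-in for a WAL position */

/* Hypothetical phases of the wait loop, for illustration only. */
typedef enum
{
    WAIT_FOR_WAL,               /* required WAL is not flushed yet */
    WAIT_FOR_STANDBY,           /* WAL is flushed, standbys are behind */
    WAIT_DONE                   /* nothing left to wait for */
} WaitPhase;

typedef struct
{
    Lsn         flushed_lsn;    /* cached flush position */
    WaitPhase   phase;          /* what the previous cycle waited for */
} WaitState;

/*
 * One cycle of the wait loop (sketch).
 *
 * The cached flush position is refreshed only while we are not waiting
 * for standbys. Freezing it during the standby wait keeps the standbys'
 * target fixed instead of moving it forward in every cycle.
 */
static WaitPhase
wait_step(WaitState *st, Lsn target_lsn, Lsn current_flush,
          Lsn standby_confirmed)
{
    if (st->phase != WAIT_FOR_STANDBY)
        st->flushed_lsn = current_flush;

    if (st->flushed_lsn < target_lsn)
        st->phase = WAIT_FOR_WAL;       /* first: wait for the WAL flush */
    else if (standby_confirmed < st->flushed_lsn)
        st->phase = WAIT_FOR_STANDBY;   /* then: wait for the standbys */
    else
        st->phase = WAIT_DONE;

    return st->phase;
}
```

In this sketch, once a cycle ends in WAIT_FOR_STANDBY, later cycles keep the
same flushed_lsn until the standby position reaches it, even if the primary
has flushed more WAL in the meantime.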
> Apart from this, I have made a few modifications in the comments.
Thanks. I have reviewed and merged them.
Here is the V104 patch, which addresses the above comments and Peter's comments.
Best Regards,
Hou zj