On Mon, May 17, 2021 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, May 17, 2021 at 9:06 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 13, 2021 at 7:06 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Thu, May 13, 2021 at 4:41 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > Few comments:
> > 1.
> > + # Ensure a transactional logical decoding message shows up on the slot
> > + $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION tap_sub DISABLE");
> >
> > After you have encapsulated this command in the function, the above
> > comment doesn't make sense because we do this for both transactional
> > and non-transactional messages. I suggest we can change it to
> > something like: "This is done to ensure a logical decoding message is
> > shown up on the slot".
> >
> > 2.
> > +# Setup the subscription before checking pg_logical_slot_peek_binary_changes
> > +sub setup_subscription
> >
> > I think here the functionality is more for the catchup of
> > subscription, so it might be better to name the function as
> > subscription_catchup or catchup_subscription. I think you can expand
> > the comments atop this function a bit as suggested by Michael.
> >
>
> One more point:
> + $node_publisher->wait_for_catchup('tap_sub');
> +
> + # Ensure a transactional logical decoding message shows up on the slot
> + $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION tap_sub DISABLE");
> +
> + # wait for the replication connection to drop from the publisher
> + $node_publisher->poll_query_until('postgres',
> + "SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE
> slot_name = 'tap_sub' AND active='f'", 1);
>
> In the above sequence, wait_for_catchup will query pg_stat_replication
> whereas after disabling subscription we are checking
> pg_replication_slots. I understand from your explanation why we can't
> rely on pg_stat_replication after DISABLE but it might be better to
> check that the slot is active before disabling it. I think currently,
> the test assumes that, isn't it better to have an explicit check for
> that?
I felt this is not required, wait_for_catchup will poll_query_until
the state = 'streaming', even if START_REPLICATION takes time, state
will be in 'startup' state, this way poll_query_until will take care
of handling this.
On further analysis I found that we need to do "Alter subscription
tap_sub ENABLE" and "ALTER subscription tap_sub DISABLE" multiple
time, Instead we can change pg_logical_slot_peek_binary_changes to
pg_logical_slot_get_binary_changes at appropriate steps. This way the
common function can be removed and the enable/disable multiple times
can be removed. If we are going ahead with this approach the above
comments provided are no more valid.
I have made the changes in similar lines in the attached patch. If you
are ok we can go ahead with the new approach which will simplify the
changes required.
Thoughts?
Regards,
Vignesh