Re: subscriptioncheck failure - Mailing list pgsql-hackers

From vignesh C
Subject Re: subscriptioncheck failure
Date
Msg-id CALDaNm2f3j2NmZVRAiSPYiHCM9ktRijSq4vMoxb8LFEDkoAfDw@mail.gmail.com
Whole thread Raw
In response to Re: subscriptioncheck failure  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: subscriptioncheck failure
List pgsql-hackers
On Tue, May 18, 2021 at 9:09 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, May 17, 2021 at 5:48 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, May 17, 2021 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > One more point:
> > > + $node_publisher->wait_for_catchup('tap_sub');
> > > +
> > > + # Ensure a transactional logical decoding message shows up on the slot
> > > + $node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION tap_sub DISABLE");
> > > +
> > > + # wait for the replication connection to drop from the publisher
> > > + $node_publisher->poll_query_until('postgres',
> > > + "SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE
> > > slot_name = 'tap_sub' AND active='f'", 1);
> > >
> > > In the above sequence, wait_for_catchup will query pg_stat_replication
> > > whereas after disabling subscription we are checking
> > > pg_replication_slots. I understand from your explanation why we can't
> > > rely on pg_stat_replication after DISABLE but it might be better to
> > > check that the slot is active before disabling it. I think currently,
> > > the test assumes that, isn't it better to have an explicit check for
> > > that?
> >
> > I felt this is not required, wait_for_catchup will poll_query_until
> > the state = 'streaming', even if START_REPLICATION takes time, state
> > will be in 'startup' state, this way poll_query_until will take care
> > of handling this.
> >
>
> makes sense, but let's add some comments to clarify the same.
>

Modified.

> > On further analysis I found that we need to do "Alter subscription
> > tap_sub ENABLE" and "ALTER subscription tap_sub DISABLE" multiple
> > time, Instead we can change pg_logical_slot_peek_binary_changes to
> > pg_logical_slot_get_binary_changes at appropriate steps. This way the
> > common function can be removed and the enable/disable multiple times
> > can be removed.
> >
>
> I think that is a valid point. This was probably kept so that we can
> peek multiple times for the same message to test various things but
> that can be achieved with the way you have changed the test.
>
> One more thing, shouldn't we make auto_vacuum=off for this test by
> using 'append_conf' before starting the publisher. That will avoid the
> risk of empty transactions.

I felt that makes sense, added it.

Thanks for the comments, the attached patch has the changes for the same.

Regards,
Vignesh

Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
Next
From: Amit Langote
Date:
Subject: Re: Forget close an open relation in ReorderBufferProcessTXN()