Re: Failure of subscription tests with topminnow - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: Failure of subscription tests with topminnow
Msg-id CAFPTHDbGSt4G9JdsTv-0ACZZWiTKExNxc5w4e5z=8YbCC+Ft5g@mail.gmail.com
In response to Re: Failure of subscription tests with topminnow  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Failure of subscription tests with topminnow  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Wed, Aug 25, 2021 at 11:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 25, 2021 at 6:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I did a quick check with the following tap test code:
> >
> > $node_publisher->poll_query_until('postgres',
> >                                   qq(
> > select 1 != foo.column1 from (values(0), (1)) as foo;
> > ));
> >
> > The query returns {t, f} but poll_query_until() never finished. The
> > same is true when the query returns {f, t}.
> >

Yes, this is true; I see the same behaviour.
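
As far as I understand it, poll_query_until() compares the entire psql
output with the expected string (by default 't'), so a result with more
than one row can never match. A minimal Python sketch of that comparison,
not the actual Perl code (poll_matches is a hypothetical name, and the
one-row-per-line output format is an assumption based on psql's
unaligned, tuples-only mode):

```python
def poll_matches(rows, expected="t"):
    """Model one polling attempt: compare the whole output to `expected`.

    `rows` holds the boolean results as psql would print them ('t'/'f').
    Assumption: each row is printed on its own line.
    """
    stdout = "".join(row + "\n" for row in rows)
    # Strip a single trailing newline, like Perl's chomp.
    if stdout.endswith("\n"):
        stdout = stdout[:-1]
    return stdout == expected


# Two rows never match, whatever their order; a single 't' row does.
print(poll_matches(["t", "f"]))
print(poll_matches(["f", "t"]))
print(poll_matches(["t"]))
```

Under this model the polling loop only finishes once the query returns
exactly one row with the value 't'.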

>
> This means something different is going on in Ajin's setup. Ajin, can
> you please share how did you confirm your findings about poll_query?

Looking at my logs again, I think what happens is this:

1. First walsender 'a' is running.
2. Second walsender 'b' starts, attempts to acquire the slot, and
finds that the slot is active for pid a.
3. Now both walsenders are active, so the query does not return.
4. First walsender 'a' times out and exits.
5. Now only the second walsender is active and the query returns OK
because pid != a.
6. Second walsender exits with error.
7. Another query attempts to get the pid of the running walsender for
tap_sub but returns null because both walsenders have exited.
8. This null return value results in the next query erroring out and
the test failing.
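
The sequence above can be modelled roughly as follows. This is only a
sketch in Python under stated assumptions: the pids 100 and 200 are made
up, and poll_output() and current_pid() are hypothetical stand-ins for
the test's polling query and pid lookup, not code from the test suite.

```python
OLD_PID = 100  # hypothetical pid of the first walsender 'a'
NEW_PID = 200  # hypothetical pid of the second walsender 'b'


def poll_output(active_pids, old_pid):
    """Model the polling query: one 't'/'f' row per active walsender,
    answering "pid != old_pid", printed one row per line."""
    return "\n".join("t" if pid != old_pid else "f" for pid in active_pids)


def current_pid(active_pids):
    """Model the follow-up pid lookup; None stands in for a null result."""
    return active_pids[0] if active_pids else None


# Active walsenders at selected steps of the sequence described above.
timeline = [
    ("step 1: only 'a' running", [OLD_PID]),
    ("step 3: 'a' and 'b' both active", [OLD_PID, NEW_PID]),
    ("step 5: 'a' timed out, only 'b' left", [NEW_PID]),
    ("step 7: 'b' exited with error", []),
]

for label, pids in timeline:
    # The poll finishes only when the whole output is exactly 't'.
    finished = poll_output(pids, OLD_PID) == "t"
    print(f"{label}: poll finished={finished}, pid lookup={current_pid(pids)}")
```

In this model the poll can only finish in the window where 'b' is the
sole walsender, and the later pid lookup finds nothing once 'b' has
exited as well.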

> Can you additionally check the value of 'state' from
> pg_stat_replication for both the old and new walsender sessions?

Yes, will try this and post a patch tomorrow.

regards,
Ajin Cherian
Fujitsu Australia


