On Thu, Aug 26, 2021 at 1:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 9:21 AM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > On Thu, Aug 26, 2021 at 1:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > >
> > > You have a point, but if we look at the logs below, it seems the
> > > second walsender (#step6) exited before the first walsender
> > > (#step4).
> > >
> > > 2021-08-15 18:44:38.041 CEST [16475:10] tap_sub LOG: disconnection:
> > > session time: 0:00:00.036 user=nm database=postgres host=[local]
> > > 2021-08-15 18:44:38.043 CEST [16336:14] tap_sub LOG: disconnection:
> > > session time: 0:00:06.367 user=nm database=postgres host=[local]
> > >
> > > Isn't it possible that pid is cleared in the other order due to which
> > > we are seeing this problem?
> >
> > If the pid is cleared in the other order, wouldn't the query [1] return false?
> >
> > [1] - " SELECT pid != 16336 FROM pg_stat_replication WHERE
> > application_name = 'tap_sub';"
> >
>
> I think it should return true because pid for 16336 is cleared first
> and the remaining one will be 16475.

Yes, that is what I explained as well: 16336 is PID 'a' (the first
walsender) in my explanation. For this theory to work, the first
walsender's pid has to be cleared first.
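To make the two orderings concrete, here is a rough Python sketch of what
query [1] would return in each case. It is only a toy model, not the TAP
test itself: pg_stat_replication is reduced to a list of rows, and the
helper names (view, query_result) are made up for illustration.

```python
OLD_PID = 16336  # first walsender (PID 'a' in the explanation)
NEW_PID = 16475  # second walsender


def view(*pids):
    """Hypothetical stand-in for pg_stat_replication: one row per live walsender."""
    return [{"pid": p, "application_name": "tap_sub"} for p in pids]


def query_result(rows, old_pid=OLD_PID, app="tap_sub"):
    """Mimic: SELECT pid != old_pid FROM pg_stat_replication
              WHERE application_name = app;
    Returns one boolean per matching row."""
    return [row["pid"] != old_pid for row in rows if row["application_name"] == app]


# Both walsenders still listed: the query yields two rows.
both_present = query_result(view(OLD_PID, NEW_PID))

# First walsender's pid cleared first: only 16475 remains.
first_cleared = query_result(view(NEW_PID))

# Second walsender's pid cleared first: only 16336 remains.
second_cleared = query_result(view(OLD_PID))

print("both present:   ", both_present)
print("first cleared:  ", first_cleared)
print("second cleared: ", second_cleared)
```

In this model the query returns a single true row only when 16336 is the
one that has gone away, which is the ordering the theory depends on.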

regards,
Ajin Cherian
Fujitsu Australia