Re: Failure in subscription test 004_sync.pl - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Failure in subscription test 004_sync.pl
Date
Msg-id CAA4eK1L8KHCxtvMQP64uRfW9ZCKKEVKUOV=4x9hT=7-CpFFD0g@mail.gmail.com
In response to Re: Failure in subscription test 004_sync.pl  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Failure in subscription test 004_sync.pl
List pgsql-hackers
On Mon, Jun 14, 2021 at 10:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >
> > I think it is showing a race condition issue in the code. In
> > DropSubscription, we first stop the worker that is receiving the WAL,
> > and then in a separate connection with the publisher, it tries to drop
> > the slot which leads to this error. The reason is that walsender is
> > still active as we just wait for wal receiver (or apply worker) to
> > stop. Normally, as soon as the apply worker is stopped the walsender
> > detects it and exits but in this case, it took some time to exit, and
> > in the meantime, we tried to drop the slot which is still in use by
> > walsender.
>
> That might be possible.
>
> That's weird since DROP SUBSCRIPTION executes DROP_REPLICATION_SLOT
> command with WAIT option. I found a bug that is possibly an oversight
> of commit 1632ea4368.
>
..
>
> The condition should be the opposite; we should raise the error when
> 'nowait' is true. I think this is the cause of the test failure. Even
> if DROP SUBSCRIPTION tries to drop the slot with the WAIT option, we
> don't wait but raise the error.
>
> Attached is a small patch that fixes it.
>

Yes, this should fix the recent buildfarm failures. Alvaro, would you
like to take care of this?

-- 
With Regards,
Amit Kapila.


