Re: Single transaction in the tablesync worker? - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Single transaction in the tablesync worker?
Date
Msg-id CAA4eK1+ayKaOk_qZ3CCq9xHaHj7TP-mngygqYdKGAZ5E2dcnmQ@mail.gmail.com
In response to Re: Single transaction in the tablesync worker?  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Jan 8, 2021 at 7:14 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Jan 7, 2021 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jan 6, 2021 at 3:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Jan 6, 2021 at 2:13 PM Peter Smith <smithpb2250@gmail.com> wrote:
> > > >
> > > > I think it makes sense. If there can be a race between the tablesync
> > > > re-launching (after error), and the AlterSubscription_refresh removing
> > > > some table’s relid from the subscription then there could be lurking
> > > > slot/origin tablesync resources (of the removed table) which a
> > > > subsequent DROP SUBSCRIPTION cannot discover. I will think more about
> > > > how/if it is possible to make this happen. Anyway, I suppose I ought
> > > > to refactor/isolate some of the tablesync cleanup code in case it
> > > > needs to be commonly called from DropSubscription and/or from
> > > > AlterSubscription_refresh.
> > > >
> > >
> > > Fair enough.
> > >
> >
> > I think before implementing, we should once try to reproduce this
> > case. I understand this is a timing issue and can be reproduced only
> > with the help of debugger but we should do that.
>
> FYI, I was able to reproduce this case in debugger. PSA logs showing details.
>

Thanks for reproducing this, as I was worried about exactly this case.
I have one question about the logs:

##
## ALTER SUBSCRIPTION to REFRESH the publication

## This blocks on some latch until the tablesync worker dies, then it continues
##

Did you check exactly which latch or lock blocks this? It is important
to retain this interlock, as otherwise, even if we decide to drop the
slot (and/or origin), the tablesync worker might continue.

--
With Regards,
Amit Kapila.


