Re: Single transaction in the tablesync worker? - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Single transaction in the tablesync worker?
Date
Msg-id CAA4eK1Kv8D8hvN-77Jvk6=FuyPz7H-wbdjCmSyVnhdEqotSHAw@mail.gmail.com
Whole thread Raw
In response to Re: Single transaction in the tablesync worker?  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Single transaction in the tablesync worker?  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Tue, Jan 5, 2021 at 3:32 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Mon, Jan 4, 2021 at 10:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 5. Is it possible to write a testcase where we fail (say due to pk
> > violation or some other error) after the initial copy is done, then
> > remove the conflicting row and allow a copy to be completed? If we
> > already have any such test then it is fine.
> >
>
> Causing a PK violation during the initial copy is not a problem to
> test, but doing it after the initial copy is difficult. I have done
> exactly this test scenario before but I thought it cannot be
> automated. E.g. To cause an PK violation error somewhere between
> COPYDONE and SYNDONE means that the offending insert (the one which
> tablesync will fail to replicate) has to be sent while the tablesync
> is in CATCHUP mode. But AFAIK that can only be achieved using the
> debugger to get the timing right.
>

Yeah, I am also not able to think of any way to automate such a test.
I was thinking about what could go wrong if we error out in that
stage. The only thing that could be problematic is if we somehow make
the slot and replication origin used during copy dangling. I think if
tablesync is restarted after error then we will clean up those which
will be normally the case but what if the tablesync worker is not
started again? I think the only possibility of tablesync worker not
started again is if during Alter Subscription ... Refresh Publication,
we remove the corresponding subscription rel (see
AlterSubscription_refresh, I guess it could happen if one has dropped
the relation from publication). I haven't tested this with your patch
but if such a possibility exists then we need to think of cleaning up
slot and origin when we remove subscription rel. What do you think?

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Parallel Inserts in CREATE TABLE AS
Next
From: "Hou, Zhijie"
Date:
Subject: RE: Parallel Inserts in CREATE TABLE AS