Re: Single transaction in the tablesync worker? - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Single transaction in the tablesync worker?
Date
Msg-id CAHut+Pt4wQcSr31ijj8gJq22KWnbUq0h2rLy2w+CB7wnKRP_mA@mail.gmail.com
Whole thread Raw
In response to Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Single transaction in the tablesync worker?
List pgsql-hackers
On Wed, Dec 23, 2020 at 8:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> 1.
> + * Rarely, the DropSubscription may be issued when a tablesync still
> + * is in SYNCDONE but not yet in READY state. If this happens then
> + * the drop slot could fail because it is already dropped.
> + * In this case suppress and drop slot error.
> + *
> + * FIXME - Is there a better way than this?
> + */
> + if (rstate->state != SUBREL_STATE_SYNCDONE)
> + PG_RE_THROW();
>
> So, does this situation happens when we try to drop subscription after
> the state is changed to syncdone but not syncready. If so, then can't
> we write a function GetSubscriptionNotDoneRelations similar to
> GetSubscriptionNotReadyRelations where we get a list of relations that
> are not in done stage. I think this should be safe because once we are
> here we shouldn't be allowed to start a new worker and old workers are
> already stopped by this function.

Yes, but I don't see how adding such a function is an improvement over
the existing code:
e.g.1. GetSubscriptionNotDoneRelations will include the READY state
(which we don't want) just like GetSubscriptionNotReadyRelations
includes the SYNCDONE state.
e.g.2. Or, something like GetSubscriptionNotDoneAndNotReadyRelations
would be an unnecessary overkill replacement for the current simple
"if".

AFAIK the code is OK as-is. That "FIXME" comment was really meant only
to highlight this for review, rather than to imply something needed to
be fixed. I have removed that "FIXME" comment in the latest patch
[v9].

>
> 2. Your changes in LogicalRepSyncTableStart() doesn't seem to be
> right. IIUC, you are copying the table in one transaction, then
> updating the state to SUBREL_STATE_COPYDONE in another transaction,
> and after that doing replorigin_advance. Consider what happened if we
> error out after the first txn is committed in which we have copied the
> table. After the restart, it will again try to copy and lead to an
> error. Similarly, consider if we error out after the second
> transaction, we won't where to start decoding from. I think all these
> should be done in a single transaction.

Fixed as suggested. Please see latest patch [v9]

---

[v9] https://www.postgresql.org/message-id/CAHut%2BPv8ShLmrjCriVU%2Btprk_9b2kwBxYK2oGSn5Eb__kWVc7A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Single transaction in the tablesync worker?
Next
From: Andrey Lepikhov
Date:
Subject: Re: [POC] Fast COPY FROM command for the table with foreign partitions