Re: Orphaned records in pg_replication_origin_status after subscription drop - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: Orphaned records in pg_replication_origin_status after subscription drop |
| Date | |
| Msg-id | CAD21AoCiFxFeWNaFKweX3L1op6GS3caxuzFWTdtFL2mfFQ6=gg@mail.gmail.com Whole thread Raw |
| In response to | RE: Orphaned records in pg_replication_origin_status after subscription drop ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
| Responses |
Re: Orphaned records in pg_replication_origin_status after subscription drop
|
| List | pgsql-hackers |
On Mon, Dec 22, 2025 at 2:15 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Monday, December 22, 2025 2:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Dec 22, 2025 at 10:16 AM Zhijie Hou (Fujitsu) > > <houzj.fnst@fujitsu.com> wrote: > > > > > > On Monday, December 22, 2025 7:01 AM Michael Paquier > > <michael@paquier.xyz> wrote: > > > > On Sat, Dec 20, 2025 at 02:55:15PM +0530, Amit Kapila wrote: > > > > > As of today, I can't think of a case where next time (restart > > > > > after > > > > > error) we won't generate the same origin_name but I think this > > > > > will add the dependency that each time the origin name should be > > > > > generated the same. > > > > > > > > ReplicationOriginNameForLogicalRep() would generate the origin name > > > > as pg_suboid_relid, so we would always reuse the same origin name on > > > > restart as long as the subscription is not recreated, wouldn't we? > > > > > > > > > Also, moving just repl_origin_create part (leaving other things > > > > > like origin_advance at its location) would need some explanation > > > > > in comments which is that it has some dependency on > > > > > DropSubscription and cleanup. Anyway, if we want to go with > > > > > creating origin in a separate transaction than COPY, then we > > > > > should change few comments like: "It is possible that the origin > > > > > is not yet created for tablesync worker, this can happen for the > > > > > states before SUBREL_STATE_FINISHEDCOPY." in the code. > > > > > > > > Hmm. So... Do you think that it should be OK to just create a new > > > > transaction before we open the transaction that locks > > > > MyLogicalRepWorker->relid (one that opens a BEGIN READ ONLY on the > > > > remote side)? And I guess that we would just move the > > > > replorigin_by_name(), > > > > replorigin_create() and ERROR into this new transaction? Setting up > > > > replorigin_session_setup() and replorigin_session_origin could then > > > > be delayed until we have done the > > > > replorigin_advance() in BEGIN READ ONLY transaction? By that I mean > > > > to leave the replorigin_advance() position untouched. I have > > > > studied this code quite a bit. I "think" that something among these > > > > lines would work, but I am not 100% sure if things are OK based the > > > > relstate we store in each of these phases. If you have an argument that > > invalidates these lines, please feel free! > > > > > > I think the solution you outlined works. One nitpick is that, instead > > > of starting a new transaction, we could create the origin within the > > > same transaction that updates the DATASYNC states, thereby ensuring > > > the origin information is available as soon as the catalog is updated. > > > I think the bug won't happen as long as the origin is created in a > > > transaction separate from the one setting up the shared memory state. > > > > > > > Agreed. But please update the comment (/* Update the state and make it > > visible to others. */) just before that transaction to reflect that origin will also > > be created in this transaction. > > Updated. > > > > > > Additionally, if we relocate the origin creation code, we need to > > > remove the ERROR report. This is because the origin may already exist > > > if a table sync restarts due to an ERROR during the initial COPY. This > > > should be safe since the origin is created with the reserved name > > > "pg_xx," preventing users from creating another origin with the same name. > > > > > > > Right. > > > > + /* > > + * Advance the origin to the LSN got from walrcv_create_slot. This is > > + WAL > > + * logged for the purpose of recovery. Locks are to prevent the > > + * replication origin from vanishing while advancing. > > + */ > > + LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock); > > + replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr, > > + true /* go backward */ , true /* WAL log */ ); > > + UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock); > > + > > /* > > * Setup replication origin tracking. The purpose of doing this before the > > * copy is to avoid doing the copy again due to any error in setting up > > * origin tracking. > > */ > > > > Shouldn't the second comment starting from "Setup replication origin .." be > > merged with the previous one because it is also intended for the previous > > code change? > > Agreed and merged. > > Here is the V2 patch which addressed above comments. > Thank you for making the patch! The patch looks good to me. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: