Re: Orphaned records in pg_replication_origin_status after subscription drop - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: Orphaned records in pg_replication_origin_status after subscription drop |
| Date | |
| Msg-id | CAA4eK1KE0gvaGBjXOiFot0PJNZ+85vik6KN4UTe5AZ7tW6B34A@mail.gmail.com Whole thread Raw |
| In response to | RE: Orphaned records in pg_replication_origin_status after subscription drop ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
| List | pgsql-hackers |
On Mon, Dec 22, 2025 at 10:16 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Monday, December 22, 2025 7:01 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Sat, Dec 20, 2025 at 02:55:15PM +0530, Amit Kapila wrote: > > > As of today, I can't think of a case where next time (restart after > > > error) we won't generate the same origin_name but I think this will > > > add the dependency that each time the origin name should be generated > > > the same. > > > > ReplicationOriginNameForLogicalRep() would generate the origin name as > > pg_suboid_relid, so we would always reuse the same origin name on restart > > as long as the subscription is not recreated, wouldn't we? > > > > > Also, moving just repl_origin_create part (leaving other things like > > > origin_advance at its location) would need some explanation in > > > comments which is that it has some dependency on DropSubscription and > > > cleanup. Anyway, if we want to go with creating origin in a separate > > > transaction than COPY, then we should change few comments like: "It is > > > possible that the origin is not yet created for tablesync worker, this > > > can happen for the states before SUBREL_STATE_FINISHEDCOPY." in the > > > code. > > > > Hmm. So... Do you think that it should be OK to just create a new transaction > > before we open the transaction that locks > > MyLogicalRepWorker->relid (one that opens a BEGIN READ ONLY on the > > remote side)? And I guess that we would just move the replorigin_by_name(), > > replorigin_create() and ERROR into this new transaction? Setting up > > replorigin_session_setup() and replorigin_session_origin could then be > > delayed until we have done the > > replorigin_advance() in BEGIN READ ONLY transaction? By that I mean to > > leave the replorigin_advance() position untouched. I have studied this code > > quite a bit. I "think" that something among these lines would work, but I am > > not 100% sure if things are OK based the relstate we store in each of these > > phases. If you have an argument that invalidates these lines, please feel free! > > I think the solution you outlined works. One nitpick is that, instead of > starting a new transaction, we could create the origin within the same > transaction that updates the DATASYNC states, thereby ensuring the origin > information is available as soon as the catalog is updated. I think the bug > won't happen as long as the origin is created in a transaction separate > from the one setting up the shared memory state. > Agreed. But please update the comment (/* Update the state and make it visible to others. */) just before that transaction to reflect that origin will also be created in this transaction. > Additionally, if we relocate the origin creation code, we need to remove the > ERROR report. This is because the origin may already exist if a table sync > restarts due to an ERROR during the initial COPY. This should be safe since the > origin is created with the reserved name "pg_xx," preventing users from creating > another origin with the same name. > Right. + /* + * Advance the origin to the LSN got from walrcv_create_slot. This is WAL + * logged for the purpose of recovery. Locks are to prevent the + * replication origin from vanishing while advancing. + */ + LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock); + replorigin_advance(originid, *origin_startpos, InvalidXLogRecPtr, + true /* go backward */ , true /* WAL log */ ); + UnlockRelationOid(ReplicationOriginRelationId, RowExclusiveLock); + /* * Setup replication origin tracking. The purpose of doing this before the * copy is to avoid doing the copy again due to any error in setting up * origin tracking. */ Shouldn't the second comment starting from "Setup replication origin .." be merged with the previous one because it is also intended for the previous code change? -- With Regards, Amit Kapila.
pgsql-hackers by date: