Re: pg_replication_origin_drop API potential race condition - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: pg_replication_origin_drop API potential race condition
Date
Msg-id CAA4eK1JxnWQ6tf8qJ1gHusVbaoBDp_rL-CXYMs2ue2zgawoU0Q@mail.gmail.com
Whole thread Raw
In response to Re: pg_replication_origin_drop API potential race condition  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: pg_replication_origin_drop API potential race condition
List pgsql-hackers
On Tue, Feb 9, 2021 at 9:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Feb 8, 2021 at 11:30 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > > +void
> > > +replorigin_drop_by_name(char *name, bool missing_ok, bool nowait)
> > > +{
> > > +     RepOriginId roident;
> > > +     Relation        rel;
> > > +
> > > +     Assert(IsTransactionState());
> > > +
> > > +     /*
> > > +      * To interlock against concurrent drops, we hold ExclusiveLock on
> > > +      * pg_replication_origin throughout this function.
> > > +      */
> >
> > This comment is now wrong though; should s/throughout.*/till xact commit/
> > to reflect the new reality.
> >
>
> Right, I'll fix in the next version.
>

Fixed in the attached.

> > I do wonder if this is going to be painful in some way, since the lock
> > is now going to be much longer-lived.  My impression is that it's okay,
> > since dropping an origin is not a very frequent occurrence.  It is going
> > to block pg_replication_origin_advance() with *any* origin, which
> > acquires RowExclusiveLock on the same relation.  If this is a problem,
> > then we could use LockSharedObject() in both places (and make it last
> > till end of xact for the case of deletion), instead of holding this
> > catalog-level lock till end of transaction.
> >
>
> IIUC, you are suggesting to use lock for the particular origin instead
> of locking the corresponding catalog table in functions
> pg_replication_origin_advance and replorigin_drop_by_name. If so, I
> don't see any problem with the same
>

I think it won't be that straightforward as we don't have origin_id.
So what we instead need to do is first to acquire a lock on
ReplicationOriginRelationId, get the origin_id, lock the specific
origin and then re-check if the origin still exists. I feel some
similar changes might be required in pg_replication_origin_advance.
Now, we can do this optimization if we want but I am not sure if
origin_drop would be a frequent enough operation that we add such an
optimization. For now, I have added a note in the comments so that if
we find any such use case we can implement such optimization in the
future. What do you think?

-- 
With Regards,
Amit Kapila.

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Is Recovery actually paused?