Re: pg_replication_origin_drop API potential race condition - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: pg_replication_origin_drop API potential race condition
Date
Msg-id CAA4eK1+RafTDCh4nX7rFSWEN9C6KkKXv+GH+wjTqrj0XV-0i8Q@mail.gmail.com
Whole thread Raw
In response to pg_replication_origin_drop API potential race condition  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: pg_replication_origin_drop API potential race condition  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Wed, Jan 27, 2021 at 4:58 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Hackers.
>
> As discovered elsewhere [ak0125] there is a potential race condition
> in the pg_replication_origin_drop API
>
> The current code of pg_replication_origin_drop looks like:
> ====
> roident = replorigin_by_name(name, false);
> Assert(OidIsValid(roident));
>
> replorigin_drop(roident, true);
> ====
>
> Users cannot deliberately drop a non-existent origin
> (replorigin_by_name passes missing_ok = false) but there is still a
> small window where concurrent processes may be able to call
> replorigin_drop for the same valid roident.
>
> Locking within replorigin_drop guards against concurrent drops so the
> 1st execution will succeed, but then the 2nd execution would give
> internal cache error: elog(ERROR, "cache lookup failed for replication
> origin with oid %u", roident);
>
> Some ideas to fix this include:
> 1. Do nothing except write a comment about this in the code. The
> internal ERROR is not ideal for a user API there is no great harm
> done.
> 2. Change the behavior of replorigin_drop to be like
> replorigin_drop_IF_EXISTS, so the 2nd execution of this race would
> silently do nothing when it finds the roident is already gone.
> 3. Same as 2, but make the NOP behavior more explicit by introducing a
> new "missing_ok" parameter for replorigin_drop.
>

How about if we call replorigin_by_name() inside replorigin_drop after
acquiring the lock? Wouldn't that close this race condition? We are
doing something similar for pg_replication_origin_advance().

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: pg_dump: Add const decorations
Next
From: Amit Kapila
Date:
Subject: Re: Single transaction in the tablesync worker?