Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable - Mailing list pgsql-hackers

From Tom Lane
Subject Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable
Date
Msg-id 2146079.1598108589@sss.pgh.pa.us
Whole thread Raw
In response to Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable
List pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Thu, Aug 20, 2020 at 10:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 2. On the other hand, the code is *releasing* the
>> ReplicationSlotControlLock before it calls
>> ProcArraySetReplicationSlotXmin, and that seems like a flat out
>> concurrency bug.

> It is not clear to me how those values can go backward.

After releasing ReplicationSlotControlLock, that code is holding no
lock at all (in the already_locked=false case I'm concerned about).
Thus the scenario to consider is:

1. Walsender A runs ReplicationSlotsComputeRequiredXmin, computes
some perfectly-valid xmins, releases ReplicationSlotControlLock,
amd then gets swapped out to Timbuktu.

2. Time passes and the "true values" of those xmins advance thanks
to other walsender activity.

3. Walsender B runs ReplicationSlotsComputeRequiredXmin, computes
some perfectly-valid xmins, and successfully stores them in the
procarray.

4. Walsender A returns from never-never land, and stores its now
quite stale results in the procarray, causing the globally visible
xmins to go backwards from where they were after step 3.

I see no mechanism in the code that prevents this scenario.
On reflection I'm not even very sure that the code change
I'm suggesting would prevent it.  It'd prevent walsenders
from entering or exiting until we've updated the procarray,
but there's nothing to stop the furthest-back walsender from
advancing its values.

There may be some argument why this can't lead to a problem,
but I don't see any comments making such an argument, either.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Creating a function for exposing memory usage of backend process
Next
From: Bruce Momjian
Date:
Subject: Re: xl_heap_header alignment?