Re: [HACKERS] Race conditions with WAL sender PID lookups - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [HACKERS] Race conditions with WAL sender PID lookups
Date
Msg-id CAB7nPqQ1EBeR5J+tF0h=ivDr4EtEQ8gxsNrxw3x3wrbWZ7=vCw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Race conditions with WAL sender PID lookups  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: [HACKERS] Race conditions with WAL sender PID lookups
List pgsql-hackers
On Thu, Jun 29, 2017 at 7:52 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Michael Paquier wrote:
>> On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I take that you are referring to the two lookups in
>> WalSndWaitForWal(), one in exec_replication_command(), one in
>> WalSndLoop() and one in WalSndDone(). First let's remember that all
>> the fields protected by the spinlock are only updated in a WAL sender
>> process, so reading them directly is safe in this context: we know
>> that no other processes will write them. And all those functions are
>> called only in the context of a WAL sender process. So the copies of
>> MyWalSnd that you are referring to here don't need a spinlock. Is
>> there something I am missing?
>
> I assume you mean "reading them unlocked" when you write "reading them
> directly".

Both expressions sound the same to me. So yes I meant that.

> If so, I agree with that rationale; I verified that it
> applies to members state, sentPtr, write, sent, flush, writeLag,
> sentLag, flushLag.  I wonder if there's a maintainability danger: as
> soon as we add a write to those members in a process other than the
> walsender itself, we have a bug.  I don't yet have an opinion on the
> severity of that problem.
>
> Other struct members such as needreload are written by other processes,
> so they should be always accessed with mutex held.  Regarding pid, it
> seems easiest if we use the rule that it must also be always accessed
> with spinlock held.  I propose updating the comment on WalSnd to this:
>
> /*
>  * Each walsender has a WalSnd struct in shared memory.
>  *
>  * This struct is protected by 'mutex', with two exceptions; one is
>  * sync_standby_priority as noted below.  The other exception is that some
>  * members are only written by the walsender process itself, and thus that
>  * process is free to read those members without holding spinlock.  pid and
>  * needreload always require the spinlock to be held for all accesses.
>  */

Agreed, a comment seems appropriate to me. That's in line with what
Horiguchi-san has mentioned upthread.

> I think I'm done with the walsender half of this patch; I still need to
> review the walreceiver part.  I will report back tomorrow 19:00 CLT.

Thanks!

> Currently, I'm considering not to backpatch any of this.

Considering how crazy the conditions to make the information fetched
by users inconsistent are met, I agree with that.
-- 
Michael



pgsql-hackers by date:

Previous
From: Jing Wang
Date:
Subject: [HACKERS] proposal: Support Unicode host variable in ECPG
Next
From: Ants Aasma
Date:
Subject: Re: [HACKERS] Reducing pg_ctl's reaction time