Thread: SyncRepLock acquired exclusively in default configuration

SyncRepLock acquired exclusively in default configuration

From

Andres Freund

Date:

06 April 2020, 05:03:32

Hi,

Due to the change below, when using the default postgres configuration
of ynchronous_commit = on, max_wal_senders = 10, will now acquire a new
exclusive lwlock after writing a commit record.

commit 48c9f4926562278a2fd2b85e7486c6d11705f177
Author: Simon Riggs <simon@2ndQuadrant.com>
Date:   2017-12-29 14:30:33 +0000

    Fix race condition when changing synchronous_standby_names

    A momentary window exists when synchronous_standby_names
    changes that allows commands issued after the change to
    continue to act as async until the change becomes visible.
    Remove the race by using more appropriate test in syncrep.c

    Author: Asim Rama Praveen and Ashwin Agrawal
    Reported-by: Xin Zhang, Ashwin Agrawal, and Asim Rama Praveen
    Reviewed-by: Michael Paquier, Masahiko Sawada

As far as I can tell there was no discussion about the added contention
due this change in the relevant thread [1].

The default configuration has an empty synchronous_standby_names. Before
this change we'd fall out of SyncRepWaitForLSN() before acquiring
SyncRepLock in exlusive mode. Now we don't anymore.


I'm really not ok with unneccessarily adding an exclusive lock
acquisition to such a crucial path.

Greetings,

Andres Freund

[1] https://postgr.es/m/CABrsG8j3kPD%2Bkbbsx_isEpFvAgaOBNGyGpsqSjQ6L8vwVUaZAQ%40mail.gmail.com

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

06 April 2020, 08:51:42

On Mon, 6 Apr 2020 at 14:04, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> Due to the change below, when using the default postgres configuration
> of ynchronous_commit = on, max_wal_senders = 10, will now acquire a new
> exclusive lwlock after writing a commit record.

Indeed.

>
> commit 48c9f4926562278a2fd2b85e7486c6d11705f177
> Author: Simon Riggs <simon@2ndQuadrant.com>
> Date:   2017-12-29 14:30:33 +0000
>
>     Fix race condition when changing synchronous_standby_names
>
>     A momentary window exists when synchronous_standby_names
>     changes that allows commands issued after the change to
>     continue to act as async until the change becomes visible.
>     Remove the race by using more appropriate test in syncrep.c
>
>     Author: Asim Rama Praveen and Ashwin Agrawal
>     Reported-by: Xin Zhang, Ashwin Agrawal, and Asim Rama Praveen
>     Reviewed-by: Michael Paquier, Masahiko Sawada
>
> As far as I can tell there was no discussion about the added contention
> due this change in the relevant thread [1].
>
> The default configuration has an empty synchronous_standby_names. Before
> this change we'd fall out of SyncRepWaitForLSN() before acquiring
> SyncRepLock in exlusive mode. Now we don't anymore.
>
>
> I'm really not ok with unneccessarily adding an exclusive lock
> acquisition to such a crucial path.
>

I think we can acquire SyncRepLock in share mode once to check
WalSndCtl->sync_standbys_defined and if it's true then check it again
after acquiring it in exclusive mode. But it in turn ends up with
adding one extra LWLockAcquire and LWLockRelease in sync rep path.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Ashwin Agrawal

Date:

06 April 2020, 18:51:06

On Mon, Apr 6, 2020 at 1:52 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

On Mon, 6 Apr 2020 at 14:04, Andres Freund <andres@anarazel.de> wrote:
>
> commit 48c9f4926562278a2fd2b85e7486c6d11705f177
> Author: Simon Riggs <simon@2ndQuadrant.com>
> Date: 2017-12-29 14:30:33 +0000
>
> Fix race condition when changing synchronous_standby_names
>
> A momentary window exists when synchronous_standby_names
> changes that allows commands issued after the change to
> continue to act as async until the change becomes visible.
> Remove the race by using more appropriate test in syncrep.c
>
> Author: Asim Rama Praveen and Ashwin Agrawal
> Reported-by: Xin Zhang, Ashwin Agrawal, and Asim Rama Praveen
> Reviewed-by: Michael Paquier, Masahiko Sawada
>
> As far as I can tell there was no discussion about the added contention
> due this change in the relevant thread [1].
>
> The default configuration has an empty synchronous_standby_names. Before
> this change we'd fall out of SyncRepWaitForLSN() before acquiring
> SyncRepLock in exlusive mode. Now we don't anymore.
>
>
> I'm really not ok with unneccessarily adding an exclusive lock
> acquisition to such a crucial path.
>

I think we can acquire SyncRepLock in share mode once to check
WalSndCtl->sync_standbys_defined and if it's true then check it again
after acquiring it in exclusive mode. But it in turn ends up with
adding one extra LWLockAcquire and LWLockRelease in sync rep path.

How about we change it to this ?

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index ffd5b31eb2..cdb82a8b28 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -165,8 +165,11 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
/*
* Fast exit if user has not requested sync replication.
*/
- if (!SyncRepRequested())
- return;
+ if (!SyncRepRequested() || !SyncStandbysDefined())
+ {
+ if (!WalSndCtl->sync_standbys_defined)
+ return;
+ }

Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
Assert(WalSndCtl != NULL);

Bring back the check which existed based on GUC but instead of just blindly returning based on just GUC not being set, check WalSndCtl->sync_standbys_defined. Thoughts?

Re: SyncRepLock acquired exclusively in default configuration

From

Andres Freund

Date:

06 April 2020, 21:14:01

Hi,

On 2020-04-06 11:51:06 -0700, Ashwin Agrawal wrote:
> On Mon, Apr 6, 2020 at 1:52 AM Masahiko Sawada <
> masahiko.sawada@2ndquadrant.com> wrote:
> > On Mon, 6 Apr 2020 at 14:04, Andres Freund <andres@anarazel.de> wrote:
> > > I'm really not ok with unneccessarily adding an exclusive lock
> > > acquisition to such a crucial path.
> > >
> >
> > I think we can acquire SyncRepLock in share mode once to check
> > WalSndCtl->sync_standbys_defined and if it's true then check it again
> > after acquiring it in exclusive mode. But it in turn ends up with
> > adding one extra LWLockAcquire and LWLockRelease in sync rep path.

That's still too much. Adding another lwlock acquisition, where the same
lock is acquired by all backends (contrasting e.g. to buffer locks), to
the commit path, for the benefit of a feature that the vast majority of
people aren't going to use, isn't good.

> How about we change it to this ?

Hm. Better. But I think it might need at least a compiler barrier /
volatile memory load?  Unlikely here, but otherwise the compiler could
theoretically just stash the variable somewhere locally (it's not likely
to be a problem because it'd not be long ago that we acquired an lwlock,
which is a full barrier).

> Bring back the check which existed based on GUC but instead of just blindly
> returning based on just GUC not being set, check
> WalSndCtl->sync_standbys_defined. Thoughts?

Hm. Is there any reason not to just check
WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
and WalSndCtl->sync_standbys_defined?

Greetings,

Andres Freund

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

07 April 2020, 06:48:07

On Tue, 7 Apr 2020 at 06:14, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-04-06 11:51:06 -0700, Ashwin Agrawal wrote:
> > On Mon, Apr 6, 2020 at 1:52 AM Masahiko Sawada <
> > masahiko.sawada@2ndquadrant.com> wrote:
> > > On Mon, 6 Apr 2020 at 14:04, Andres Freund <andres@anarazel.de> wrote:
> > > > I'm really not ok with unneccessarily adding an exclusive lock
> > > > acquisition to such a crucial path.
> > > >
> > >
> > > I think we can acquire SyncRepLock in share mode once to check
> > > WalSndCtl->sync_standbys_defined and if it's true then check it again
> > > after acquiring it in exclusive mode. But it in turn ends up with
> > > adding one extra LWLockAcquire and LWLockRelease in sync rep path.
>
> That's still too much. Adding another lwlock acquisition, where the same
> lock is acquired by all backends (contrasting e.g. to buffer locks), to
> the commit path, for the benefit of a feature that the vast majority of
> people aren't going to use, isn't good.

Agreed.

In this case it seems okay to read WalSndCtl->sync_standbys_defined
without SyncRepLock before we acquire SyncRepLock in exclusive mode.
While changing WalSndCtl->sync_standbys_defined to true, in the
current code a backend who reached SyncRepWaitForLSN() waits on
SyncRepLock, see sync_standbys_defined is true and enqueue itself.
With this change, since we don't acquire SyncRepLock to read
WalSndCtl->sync_standbys_defined these backends return without waiting
for the change of WalSndCtl->sync_standbys_defined but it would not be
a problem. Similarly, I've considered the case where changing to
false, but I think there is no problem.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Ashwin Agrawal

Date:

07 April 2020, 18:01:55

On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de> wrote:

> How about we change it to this ?

Hm. Better. But I think it might need at least a compiler barrier /
volatile memory load? Unlikely here, but otherwise the compiler could
theoretically just stash the variable somewhere locally (it's not likely
to be a problem because it'd not be long ago that we acquired an lwlock,
which is a full barrier).

That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(), it seems fine.

> Bring back the check which existed based on GUC but instead of just blindly
> returning based on just GUC not being set, check
> WalSndCtl->sync_standbys_defined. Thoughts?

Hm. Is there any reason not to just check
WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
and WalSndCtl->sync_standbys_defined?

Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.

I wasn't fully thinking there, as I got distracted by if lock will be required or not for reading the same. If lock was required then checking for guc first would have been better, but seems not required.

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

10 April 2020, 04:20:48


On 2020/04/08 3:01, Ashwin Agrawal wrote:
> 
> On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
> 
>      > How about we change it to this ?
> 
>     Hm. Better. But I think it might need at least a compiler barrier /
>     volatile memory load?  Unlikely here, but otherwise the compiler could
>     theoretically just stash the variable somewhere locally (it's not likely
>     to be a problem because it'd not be long ago that we acquired an lwlock,
>     which is a full barrier).
> 
> 
> That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(), it
seemsfine.
 
> 
>      > Bring back the check which existed based on GUC but instead of just blindly
>      > returning based on just GUC not being set, check
>      > WalSndCtl->sync_standbys_defined. Thoughts?
> 
>     Hm. Is there any reason not to just check
>     WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
>     and WalSndCtl->sync_standbys_defined?
> 
> 
> Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.

So the consensus is something like the following? Patch attached.

         /*
-        * Fast exit if user has not requested sync replication.
+        * Fast exit if user has not requested sync replication, or there are no
+        * sync replication standby names defined.
          */
-       if (!SyncRepRequested())
+       if (!SyncRepRequested() ||
+               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
                 return;

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment

syncreplock.patch

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

10 April 2020, 05:11:48

On Fri, 10 Apr 2020 at 13:20, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/04/08 3:01, Ashwin Agrawal wrote:
> >
> > On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
> >
> >      > How about we change it to this ?
> >
> >     Hm. Better. But I think it might need at least a compiler barrier /
> >     volatile memory load?  Unlikely here, but otherwise the compiler could
> >     theoretically just stash the variable somewhere locally (it's not likely
> >     to be a problem because it'd not be long ago that we acquired an lwlock,
> >     which is a full barrier).
> >
> >
> > That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(), it
seemsfine.
 
> >
> >      > Bring back the check which existed based on GUC but instead of just blindly
> >      > returning based on just GUC not being set, check
> >      > WalSndCtl->sync_standbys_defined. Thoughts?
> >
> >     Hm. Is there any reason not to just check
> >     WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
> >     and WalSndCtl->sync_standbys_defined?
> >
> >
> > Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.
>
> So the consensus is something like the following? Patch attached.
>
>          /*
> -        * Fast exit if user has not requested sync replication.
> +        * Fast exit if user has not requested sync replication, or there are no
> +        * sync replication standby names defined.
>           */
> -       if (!SyncRepRequested())
> +       if (!SyncRepRequested() ||
> +               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
>                  return;
>

I think we need more comments describing why checking
sync_standby_defined without SyncRepLock is safe here. For example:

This routine gets called every commit time. So, to check if the
synchronous standbys is defined as quick as possible we check
WalSndCtl->sync_standbys_defined without acquiring SyncRepLock. Since
we make this test unlocked, there's a change we might fail to notice
that it has been turned off and continue processing. But since the
subsequent check will check it again while holding SyncRepLock, it's
no problem. Similarly even if we fail to notice that it has been
turned on, it's okay to return quickly since all backend consistently
behaves so.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

10 April 2020, 09:57:12


On 2020/04/10 14:11, Masahiko Sawada wrote:
> On Fri, 10 Apr 2020 at 13:20, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>>
>>
>> On 2020/04/08 3:01, Ashwin Agrawal wrote:
>>>
>>> On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
>>>
>>>       > How about we change it to this ?
>>>
>>>      Hm. Better. But I think it might need at least a compiler barrier /
>>>      volatile memory load?  Unlikely here, but otherwise the compiler could
>>>      theoretically just stash the variable somewhere locally (it's not likely
>>>      to be a problem because it'd not be long ago that we acquired an lwlock,
>>>      which is a full barrier).
>>>
>>>
>>> That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(), it
seemsfine.
 
>>>
>>>       > Bring back the check which existed based on GUC but instead of just blindly
>>>       > returning based on just GUC not being set, check
>>>       > WalSndCtl->sync_standbys_defined. Thoughts?
>>>
>>>      Hm. Is there any reason not to just check
>>>      WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
>>>      and WalSndCtl->sync_standbys_defined?
>>>
>>>
>>> Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.
>>
>> So the consensus is something like the following? Patch attached.
>>
>>           /*
>> -        * Fast exit if user has not requested sync replication.
>> +        * Fast exit if user has not requested sync replication, or there are no
>> +        * sync replication standby names defined.
>>            */
>> -       if (!SyncRepRequested())
>> +       if (!SyncRepRequested() ||
>> +               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
>>                   return;
>>
> 
> I think we need more comments describing why checking
> sync_standby_defined without SyncRepLock is safe here. For example:

Yep, agreed!

> This routine gets called every commit time. So, to check if the
> synchronous standbys is defined as quick as possible we check
> WalSndCtl->sync_standbys_defined without acquiring SyncRepLock. Since
> we make this test unlocked, there's a change we might fail to notice
> that it has been turned off and continue processing.

Does this really happen? I was thinking that the problem by not taking
the lock here is that SyncRepWaitForLSN() can see that shared flag after
SyncRepUpdateSyncStandbysDefined() wakes up all the waiters and
before it sets the flag to false. Then if SyncRepWaitForLSN() adds  itself
into the wait queue becaues the flag was true, without lock, it may keep
sleeping infinitely.

> But since the
> subsequent check will check it again while holding SyncRepLock, it's
> no problem. Similarly even if we fail to notice that it has been
> turned on
Is this true? ISTM that after SyncRepUpdateSyncStandbysDefined()
sets the flag to true, SyncRepWaitForLSN() basically doesn't seem
to fail to notice that. No?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

10 April 2020, 11:56:57

On Fri, 10 Apr 2020 at 18:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/04/10 14:11, Masahiko Sawada wrote:
> > On Fri, 10 Apr 2020 at 13:20, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>
> >>
> >>
> >> On 2020/04/08 3:01, Ashwin Agrawal wrote:
> >>>
> >>> On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
> >>>
> >>>       > How about we change it to this ?
> >>>
> >>>      Hm. Better. But I think it might need at least a compiler barrier /
> >>>      volatile memory load?  Unlikely here, but otherwise the compiler could
> >>>      theoretically just stash the variable somewhere locally (it's not likely
> >>>      to be a problem because it'd not be long ago that we acquired an lwlock,
> >>>      which is a full barrier).
> >>>
> >>>
> >>> That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(), it
seemsfine.
 
> >>>
> >>>       > Bring back the check which existed based on GUC but instead of just blindly
> >>>       > returning based on just GUC not being set, check
> >>>       > WalSndCtl->sync_standbys_defined. Thoughts?
> >>>
> >>>      Hm. Is there any reason not to just check
> >>>      WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
> >>>      and WalSndCtl->sync_standbys_defined?
> >>>
> >>>
> >>> Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.
> >>
> >> So the consensus is something like the following? Patch attached.
> >>
> >>           /*
> >> -        * Fast exit if user has not requested sync replication.
> >> +        * Fast exit if user has not requested sync replication, or there are no
> >> +        * sync replication standby names defined.
> >>            */
> >> -       if (!SyncRepRequested())
> >> +       if (!SyncRepRequested() ||
> >> +               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
> >>                   return;
> >>
> >
> > I think we need more comments describing why checking
> > sync_standby_defined without SyncRepLock is safe here. For example:
>
> Yep, agreed!
>
> > This routine gets called every commit time. So, to check if the
> > synchronous standbys is defined as quick as possible we check
> > WalSndCtl->sync_standbys_defined without acquiring SyncRepLock. Since
> > we make this test unlocked, there's a change we might fail to notice
> > that it has been turned off and continue processing.
>
> Does this really happen? I was thinking that the problem by not taking
> the lock here is that SyncRepWaitForLSN() can see that shared flag after
> SyncRepUpdateSyncStandbysDefined() wakes up all the waiters and
> before it sets the flag to false. Then if SyncRepWaitForLSN() adds  itself
> into the wait queue becaues the flag was true, without lock, it may keep
> sleeping infinitely.

I think that because a backend does the following check after
acquiring SyncRepLock, in that case, once the backend has taken
SyncRepLock it can see that sync_standbys_defined is false and return.
But you meant that we do both checks without SyncRepLock?

    /*
     * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
     * set.  See SyncRepUpdateSyncStandbysDefined.
     *
     * Also check that the standby hasn't already replied. Unlikely race
     * condition but we'll be fetching that cache line anyway so it's likely
     * to be a low cost check.
     */
    if (!WalSndCtl->sync_standbys_defined ||
        lsn <= WalSndCtl->lsn[mode])
    {
        LWLockRelease(SyncRepLock);
        return;
    }

>
> > But since the
> > subsequent check will check it again while holding SyncRepLock, it's
> > no problem. Similarly even if we fail to notice that it has been
> > turned on
> Is this true? ISTM that after SyncRepUpdateSyncStandbysDefined()
> sets the flag to true, SyncRepWaitForLSN() basically doesn't seem
> to fail to notice that. No?

What I wanted to say is, in the current code, while the checkpointer
process is holding SyncRepLock to turn off sync_standbys_defined,
backends who reach SyncRepWaitForLSN() wait for the lock. Then, after
the checkpointer process releases SyncRepLock these backend can
enqueue themselves to the wait queue because they can see that
sync_standbys_defined is turned on. On the other hand if we do the
check without SyncRepLock, backends who reach SyncRepWaitForLSN() will
return instead of waiting, in spite of checkpointer process being
turning on sync_standbys_defined. Which means these backends are
failing to notice that it has been turned on, I thought.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

10 April 2020, 12:52:03


On 2020/04/10 20:56, Masahiko Sawada wrote:
> On Fri, 10 Apr 2020 at 18:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>>
>>
>> On 2020/04/10 14:11, Masahiko Sawada wrote:
>>> On Fri, 10 Apr 2020 at 13:20, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2020/04/08 3:01, Ashwin Agrawal wrote:
>>>>>
>>>>> On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
>>>>>
>>>>>        > How about we change it to this ?
>>>>>
>>>>>       Hm. Better. But I think it might need at least a compiler barrier /
>>>>>       volatile memory load?  Unlikely here, but otherwise the compiler could
>>>>>       theoretically just stash the variable somewhere locally (it's not likely
>>>>>       to be a problem because it'd not be long ago that we acquired an lwlock,
>>>>>       which is a full barrier).
>>>>>
>>>>>
>>>>> That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(), it
seemsfine.
 
>>>>>
>>>>>        > Bring back the check which existed based on GUC but instead of just blindly
>>>>>        > returning based on just GUC not being set, check
>>>>>        > WalSndCtl->sync_standbys_defined. Thoughts?
>>>>>
>>>>>       Hm. Is there any reason not to just check
>>>>>       WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
>>>>>       and WalSndCtl->sync_standbys_defined?
>>>>>
>>>>>
>>>>> Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.
>>>>
>>>> So the consensus is something like the following? Patch attached.
>>>>
>>>>            /*
>>>> -        * Fast exit if user has not requested sync replication.
>>>> +        * Fast exit if user has not requested sync replication, or there are no
>>>> +        * sync replication standby names defined.
>>>>             */
>>>> -       if (!SyncRepRequested())
>>>> +       if (!SyncRepRequested() ||
>>>> +               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
>>>>                    return;
>>>>
>>>
>>> I think we need more comments describing why checking
>>> sync_standby_defined without SyncRepLock is safe here. For example:
>>
>> Yep, agreed!
>>
>>> This routine gets called every commit time. So, to check if the
>>> synchronous standbys is defined as quick as possible we check
>>> WalSndCtl->sync_standbys_defined without acquiring SyncRepLock. Since
>>> we make this test unlocked, there's a change we might fail to notice
>>> that it has been turned off and continue processing.
>>
>> Does this really happen? I was thinking that the problem by not taking
>> the lock here is that SyncRepWaitForLSN() can see that shared flag after
>> SyncRepUpdateSyncStandbysDefined() wakes up all the waiters and
>> before it sets the flag to false. Then if SyncRepWaitForLSN() adds  itself
>> into the wait queue becaues the flag was true, without lock, it may keep
>> sleeping infinitely.
> 
> I think that because a backend does the following check after
> acquiring SyncRepLock, in that case, once the backend has taken
> SyncRepLock it can see that sync_standbys_defined is false and return.

Yes, but the backend can see that sync_standby_defined indicates false
whether holding SyncRepLock or not, after the checkpointer sets it to false.

> But you meant that we do both checks without SyncRepLock?

Maybe No. The change that the latest patch provides should be applied, I think.
That is, sync_standbys_defined should be check without lock at first, then
only if it's true, it should be checked again with lock.

ISTM that basically SyncRepLock is used in SyncRepWaitForLSN() and
SyncRepUpdateSyncStandbysDefined() to make operation on the queue
and enabling sync_standbys_defined atomic. Without lock, the issue that
the comment in SyncRepUpdateSyncStandbysDefined() explains would
happen. That is, the backend may keep waiting infinitely as follows.

1. checkpointer calls SyncRepUpdateSyncStandbysDefined()
2. checkpointer sees that the flag indicates true but the config indicates false
3. checkpointer takes lock and wakes up all the waiters
4. backend calls SyncRepWaitForLSN() can see that the flag indicates true
5. checkpointer sets the flag to false and releases the lock
6. backend adds itself to the queue and wait until it's waken up, but will not happen immediately

So after the backend sees that the flag indicates true without lock,
it must check the flag again with lock immediately without operating
the queue. If this my understanding is right, I was thinking that
the comment should mention these things.

>      /*
>       * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
>       * set.  See SyncRepUpdateSyncStandbysDefined.
>       *
>       * Also check that the standby hasn't already replied. Unlikely race
>       * condition but we'll be fetching that cache line anyway so it's likely
>       * to be a low cost check.
>       */
>      if (!WalSndCtl->sync_standbys_defined ||
>          lsn <= WalSndCtl->lsn[mode])
>      {
>          LWLockRelease(SyncRepLock);
>          return;
>      }
> 
>>
>>> But since the
>>> subsequent check will check it again while holding SyncRepLock, it's
>>> no problem. Similarly even if we fail to notice that it has been
>>> turned on
>> Is this true? ISTM that after SyncRepUpdateSyncStandbysDefined()
>> sets the flag to true, SyncRepWaitForLSN() basically doesn't seem
>> to fail to notice that. No?
> 
> What I wanted to say is, in the current code, while the checkpointer
> process is holding SyncRepLock to turn off sync_standbys_defined,
> backends who reach SyncRepWaitForLSN() wait for the lock. Then, after
> the checkpointer process releases SyncRepLock these backend can
> enqueue themselves to the wait queue because they can see that
> sync_standbys_defined is turned on.

In this case, since the checkpointer turned the flag off while holding
the lock, the backend sees that the flag is turned off, and doesn't
enqueue itself. No?

> On the other hand if we do the
> check without SyncRepLock, backends who reach SyncRepWaitForLSN() will
> return instead of waiting, in spite of checkpointer process being
> turning on sync_standbys_defined. Which means these backends are
> failing to notice that it has been turned on, I thought.

No. Or I'm missing something... In this case, the backend sees that
the flag is turned on without lock since checkpointer turned it on.
So you're thinking the following. Right?

1. sync_standbys_defined flag is false
2. checkpointer takes the lock and turns the flag on
3. backend sees the flag
4. checkpointer releases the lock

In #3, the flag indicates true, I think. But you think it's false?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

11 April 2020, 00:30:30

On Fri, 10 Apr 2020 at 21:52, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/04/10 20:56, Masahiko Sawada wrote:
> > On Fri, 10 Apr 2020 at 18:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>
> >>
> >>
> >> On 2020/04/10 14:11, Masahiko Sawada wrote:
> >>> On Fri, 10 Apr 2020 at 13:20, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2020/04/08 3:01, Ashwin Agrawal wrote:
> >>>>>
> >>>>> On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
> >>>>>
> >>>>>        > How about we change it to this ?
> >>>>>
> >>>>>       Hm. Better. But I think it might need at least a compiler barrier /
> >>>>>       volatile memory load?  Unlikely here, but otherwise the compiler could
> >>>>>       theoretically just stash the variable somewhere locally (it's not likely
> >>>>>       to be a problem because it'd not be long ago that we acquired an lwlock,
> >>>>>       which is a full barrier).
> >>>>>
> >>>>>
> >>>>> That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(),
itseems fine.
 
> >>>>>
> >>>>>        > Bring back the check which existed based on GUC but instead of just blindly
> >>>>>        > returning based on just GUC not being set, check
> >>>>>        > WalSndCtl->sync_standbys_defined. Thoughts?
> >>>>>
> >>>>>       Hm. Is there any reason not to just check
> >>>>>       WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
> >>>>>       and WalSndCtl->sync_standbys_defined?
> >>>>>
> >>>>>
> >>>>> Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.
> >>>>
> >>>> So the consensus is something like the following? Patch attached.
> >>>>
> >>>>            /*
> >>>> -        * Fast exit if user has not requested sync replication.
> >>>> +        * Fast exit if user has not requested sync replication, or there are no
> >>>> +        * sync replication standby names defined.
> >>>>             */
> >>>> -       if (!SyncRepRequested())
> >>>> +       if (!SyncRepRequested() ||
> >>>> +               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
> >>>>                    return;
> >>>>
> >>>
> >>> I think we need more comments describing why checking
> >>> sync_standby_defined without SyncRepLock is safe here. For example:
> >>
> >> Yep, agreed!
> >>
> >>> This routine gets called every commit time. So, to check if the
> >>> synchronous standbys is defined as quick as possible we check
> >>> WalSndCtl->sync_standbys_defined without acquiring SyncRepLock. Since
> >>> we make this test unlocked, there's a change we might fail to notice
> >>> that it has been turned off and continue processing.
> >>
> >> Does this really happen? I was thinking that the problem by not taking
> >> the lock here is that SyncRepWaitForLSN() can see that shared flag after
> >> SyncRepUpdateSyncStandbysDefined() wakes up all the waiters and
> >> before it sets the flag to false. Then if SyncRepWaitForLSN() adds  itself
> >> into the wait queue becaues the flag was true, without lock, it may keep
> >> sleeping infinitely.
> >
> > I think that because a backend does the following check after
> > acquiring SyncRepLock, in that case, once the backend has taken
> > SyncRepLock it can see that sync_standbys_defined is false and return.
>
> Yes, but the backend can see that sync_standby_defined indicates false
> whether holding SyncRepLock or not, after the checkpointer sets it to false.
>
> > But you meant that we do both checks without SyncRepLock?
>
> Maybe No. The change that the latest patch provides should be applied, I think.
> That is, sync_standbys_defined should be check without lock at first, then
> only if it's true, it should be checked again with lock.

Yes. My understanding is the same.

After applying your patch, SyncRepWaitForLSN() is going to become
something like:

    /*
     * Fast exit if user has not requested sync replication, or there are no
     * sync replication standby names defined.
     */
    if (!SyncRepRequested() ||
        !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
        return;

    Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
    Assert(WalSndCtl != NULL);

    LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
    Assert(MyProc->syncRepState == SYNC_REP_NOT_WAITING);

    /*
     * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
     * set.  See SyncRepUpdateSyncStandbysDefined.
     *
     * Also check that the standby hasn't already replied. Unlikely race
     * condition but we'll be fetching that cache line anyway so it's likely
     * to be a low cost check.
     */
    if (!WalSndCtl->sync_standbys_defined ||
        lsn <= WalSndCtl->lsn[mode])
    {
        LWLockRelease(SyncRepLock);
        return;
    }

    /*
     * Set our waitLSN so WALSender will know when to wake us, and add
     * ourselves to the queue.
     */
    MyProc->waitLSN = lsn;
    MyProc->syncRepState = SYNC_REP_WAITING;
    SyncRepQueueInsert(mode);
    Assert(SyncRepQueueIsOrderedByLSN(mode));
    LWLockRelease(SyncRepLock);

There are two checks of sync_standbys_defined. The first check is
performed without SyncRepLock and the second check is performed with
SyncRepLock. That's what you and I are expecting. Right?

>
> ISTM that basically SyncRepLock is used in SyncRepWaitForLSN() and
> SyncRepUpdateSyncStandbysDefined() to make operation on the queue
> and enabling sync_standbys_defined atomic. Without lock, the issue that
> the comment in SyncRepUpdateSyncStandbysDefined() explains would
> happen. That is, the backend may keep waiting infinitely as follows.
>

Let me think the following sequence after applying your changes:

> 1. checkpointer calls SyncRepUpdateSyncStandbysDefined()
> 2. checkpointer sees that the flag indicates true but the config indicates false
> 3. checkpointer takes lock and wakes up all the waiters
> 4. backend calls SyncRepWaitForLSN() can see that the flag indicates true

Yes, I suppose this is the first check of sync_standbys_defined.

And before the second check, backend tries to acquire SyncRepLock but
since the lock is already being held by checkohpointer it must wait.

> 5. checkpointer sets the flag to false and releases the lock

After checkpointer release the lock, the backend is woken up.

> 6. backend adds itself to the queue and wait until it's waken up, but will not happen immediately

The backend sees that the flag has been false at the second check, so return.

If we didn't acquire SyncRepLock even for the second check I think the
backend would keep waiting infinitely as you mentioned.

>
> So after the backend sees that the flag indicates true without lock,
> it must check the flag again with lock immediately without operating
> the queue. If this my understanding is right, I was thinking that
> the comment should mention these things.

I think that's right. I was going to describe why we do the first
check without SyncRepLock and why it is safe but it seems to me that
these things you mentioned are related to the second check, if I'm not
missing something.

>
> >      /*
> >       * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
> >       * set.  See SyncRepUpdateSyncStandbysDefined.
> >       *
> >       * Also check that the standby hasn't already replied. Unlikely race
> >       * condition but we'll be fetching that cache line anyway so it's likely
> >       * to be a low cost check.
> >       */
> >      if (!WalSndCtl->sync_standbys_defined ||
> >          lsn <= WalSndCtl->lsn[mode])
> >      {
> >          LWLockRelease(SyncRepLock);
> >          return;
> >      }
> >
> >>
> >>> But since the
> >>> subsequent check will check it again while holding SyncRepLock, it's
> >>> no problem. Similarly even if we fail to notice that it has been
> >>> turned on
> >> Is this true? ISTM that after SyncRepUpdateSyncStandbysDefined()
> >> sets the flag to true, SyncRepWaitForLSN() basically doesn't seem
> >> to fail to notice that. No?
> >
> > What I wanted to say is, in the current code, while the checkpointer
> > process is holding SyncRepLock to turn off sync_standbys_defined,
> > backends who reach SyncRepWaitForLSN() wait for the lock. Then, after
> > the checkpointer process releases SyncRepLock these backend can
> > enqueue themselves to the wait queue because they can see that
> > sync_standbys_defined is turned on.
>
> In this case, since the checkpointer turned the flag off while holding
> the lock, the backend sees that the flag is turned off, and doesn't
> enqueue itself. No?

Oops, I had mistake here. It should be "while the checkpointer process
is holding SyncRepLock to turn *on* sync_standbys_defined, ...".

>
> > On the other hand if we do the
> > check without SyncRepLock, backends who reach SyncRepWaitForLSN() will
> > return instead of waiting, in spite of checkpointer process being
> > turning on sync_standbys_defined. Which means these backends are
> > failing to notice that it has been turned on, I thought.
>
> No. Or I'm missing something... In this case, the backend sees that
> the flag is turned on without lock since checkpointer turned it on.
> So you're thinking the following. Right?
>
> 1. sync_standbys_defined flag is false
> 2. checkpointer takes the lock and turns the flag on
> 3. backend sees the flag
> 4. checkpointer releases the lock
>
> In #3, the flag indicates true, I think. But you think it's false?

I meant the backends who reach SyncRepLock() while checkpointer is at
after acquiring the lock but before turning the flag on, described at
#3 step in the following sequence. Such backends will wait for the
lock in the current code, but after applying the patch they return
quickly. So what I'm thinking is:

1. sync_standbys_defined flag is false
2. checkpointer takes the lock
3. backend sees the flag, and return as it's still false
4. checkpointer turns the flag on
5. checkpointer releases the lock

If a backend reaches SyncRepWaitForLSN() between #4 and #5 it will
wait for the lock and then enqueue itself after acquiring the lock.
But such behavior is not changed before and after applying the patch.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

19 May 2020, 02:41:12

On Sat, 11 Apr 2020 at 09:30, Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Fri, 10 Apr 2020 at 21:52, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >
> >
> >
> > On 2020/04/10 20:56, Masahiko Sawada wrote:
> > > On Fri, 10 Apr 2020 at 18:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2020/04/10 14:11, Masahiko Sawada wrote:
> > >>> On Fri, 10 Apr 2020 at 13:20, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 2020/04/08 3:01, Ashwin Agrawal wrote:
> > >>>>>
> > >>>>> On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
> > >>>>>
> > >>>>>        > How about we change it to this ?
> > >>>>>
> > >>>>>       Hm. Better. But I think it might need at least a compiler barrier /
> > >>>>>       volatile memory load?  Unlikely here, but otherwise the compiler could
> > >>>>>       theoretically just stash the variable somewhere locally (it's not likely
> > >>>>>       to be a problem because it'd not be long ago that we acquired an lwlock,
> > >>>>>       which is a full barrier).
> > >>>>>
> > >>>>>
> > >>>>> That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(),
itseems fine.
 
> > >>>>>
> > >>>>>        > Bring back the check which existed based on GUC but instead of just blindly
> > >>>>>        > returning based on just GUC not being set, check
> > >>>>>        > WalSndCtl->sync_standbys_defined. Thoughts?
> > >>>>>
> > >>>>>       Hm. Is there any reason not to just check
> > >>>>>       WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
> > >>>>>       and WalSndCtl->sync_standbys_defined?
> > >>>>>
> > >>>>>
> > >>>>> Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.
> > >>>>
> > >>>> So the consensus is something like the following? Patch attached.
> > >>>>
> > >>>>            /*
> > >>>> -        * Fast exit if user has not requested sync replication.
> > >>>> +        * Fast exit if user has not requested sync replication, or there are no
> > >>>> +        * sync replication standby names defined.
> > >>>>             */
> > >>>> -       if (!SyncRepRequested())
> > >>>> +       if (!SyncRepRequested() ||
> > >>>> +               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
> > >>>>                    return;
> > >>>>
> > >>>
> > >>> I think we need more comments describing why checking
> > >>> sync_standby_defined without SyncRepLock is safe here. For example:
> > >>
> > >> Yep, agreed!
> > >>
> > >>> This routine gets called every commit time. So, to check if the
> > >>> synchronous standbys is defined as quick as possible we check
> > >>> WalSndCtl->sync_standbys_defined without acquiring SyncRepLock. Since
> > >>> we make this test unlocked, there's a change we might fail to notice
> > >>> that it has been turned off and continue processing.
> > >>
> > >> Does this really happen? I was thinking that the problem by not taking
> > >> the lock here is that SyncRepWaitForLSN() can see that shared flag after
> > >> SyncRepUpdateSyncStandbysDefined() wakes up all the waiters and
> > >> before it sets the flag to false. Then if SyncRepWaitForLSN() adds  itself
> > >> into the wait queue becaues the flag was true, without lock, it may keep
> > >> sleeping infinitely.
> > >
> > > I think that because a backend does the following check after
> > > acquiring SyncRepLock, in that case, once the backend has taken
> > > SyncRepLock it can see that sync_standbys_defined is false and return.
> >
> > Yes, but the backend can see that sync_standby_defined indicates false
> > whether holding SyncRepLock or not, after the checkpointer sets it to false.
> >
> > > But you meant that we do both checks without SyncRepLock?
> >
> > Maybe No. The change that the latest patch provides should be applied, I think.
> > That is, sync_standbys_defined should be check without lock at first, then
> > only if it's true, it should be checked again with lock.
>
> Yes. My understanding is the same.
>
> After applying your patch, SyncRepWaitForLSN() is going to become
> something like:
>
>     /*
>      * Fast exit if user has not requested sync replication, or there are no
>      * sync replication standby names defined.
>      */
>     if (!SyncRepRequested() ||
>         !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
>         return;
>
>     Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
>     Assert(WalSndCtl != NULL);
>
>     LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
>     Assert(MyProc->syncRepState == SYNC_REP_NOT_WAITING);
>
>     /*
>      * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
>      * set.  See SyncRepUpdateSyncStandbysDefined.
>      *
>      * Also check that the standby hasn't already replied. Unlikely race
>      * condition but we'll be fetching that cache line anyway so it's likely
>      * to be a low cost check.
>      */
>     if (!WalSndCtl->sync_standbys_defined ||
>         lsn <= WalSndCtl->lsn[mode])
>     {
>         LWLockRelease(SyncRepLock);
>         return;
>     }
>
>     /*
>      * Set our waitLSN so WALSender will know when to wake us, and add
>      * ourselves to the queue.
>      */
>     MyProc->waitLSN = lsn;
>     MyProc->syncRepState = SYNC_REP_WAITING;
>     SyncRepQueueInsert(mode);
>     Assert(SyncRepQueueIsOrderedByLSN(mode));
>     LWLockRelease(SyncRepLock);
>
> There are two checks of sync_standbys_defined. The first check is
> performed without SyncRepLock and the second check is performed with
> SyncRepLock. That's what you and I are expecting. Right?
>
> >
> > ISTM that basically SyncRepLock is used in SyncRepWaitForLSN() and
> > SyncRepUpdateSyncStandbysDefined() to make operation on the queue
> > and enabling sync_standbys_defined atomic. Without lock, the issue that
> > the comment in SyncRepUpdateSyncStandbysDefined() explains would
> > happen. That is, the backend may keep waiting infinitely as follows.
> >
>
> Let me think the following sequence after applying your changes:
>
> > 1. checkpointer calls SyncRepUpdateSyncStandbysDefined()
> > 2. checkpointer sees that the flag indicates true but the config indicates false
> > 3. checkpointer takes lock and wakes up all the waiters
> > 4. backend calls SyncRepWaitForLSN() can see that the flag indicates true
>
> Yes, I suppose this is the first check of sync_standbys_defined.
>
> And before the second check, backend tries to acquire SyncRepLock but
> since the lock is already being held by checkohpointer it must wait.
>
> > 5. checkpointer sets the flag to false and releases the lock
>
> After checkpointer release the lock, the backend is woken up.
>
> > 6. backend adds itself to the queue and wait until it's waken up, but will not happen immediately
>
> The backend sees that the flag has been false at the second check, so return.
>
> If we didn't acquire SyncRepLock even for the second check I think the
> backend would keep waiting infinitely as you mentioned.
>
> >
> > So after the backend sees that the flag indicates true without lock,
> > it must check the flag again with lock immediately without operating
> > the queue. If this my understanding is right, I was thinking that
> > the comment should mention these things.
>
> I think that's right. I was going to describe why we do the first
> check without SyncRepLock and why it is safe but it seems to me that
> these things you mentioned are related to the second check, if I'm not
> missing something.
>
> >
> > >      /*
> > >       * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
> > >       * set.  See SyncRepUpdateSyncStandbysDefined.
> > >       *
> > >       * Also check that the standby hasn't already replied. Unlikely race
> > >       * condition but we'll be fetching that cache line anyway so it's likely
> > >       * to be a low cost check.
> > >       */
> > >      if (!WalSndCtl->sync_standbys_defined ||
> > >          lsn <= WalSndCtl->lsn[mode])
> > >      {
> > >          LWLockRelease(SyncRepLock);
> > >          return;
> > >      }
> > >
> > >>
> > >>> But since the
> > >>> subsequent check will check it again while holding SyncRepLock, it's
> > >>> no problem. Similarly even if we fail to notice that it has been
> > >>> turned on
> > >> Is this true? ISTM that after SyncRepUpdateSyncStandbysDefined()
> > >> sets the flag to true, SyncRepWaitForLSN() basically doesn't seem
> > >> to fail to notice that. No?
> > >
> > > What I wanted to say is, in the current code, while the checkpointer
> > > process is holding SyncRepLock to turn off sync_standbys_defined,
> > > backends who reach SyncRepWaitForLSN() wait for the lock. Then, after
> > > the checkpointer process releases SyncRepLock these backend can
> > > enqueue themselves to the wait queue because they can see that
> > > sync_standbys_defined is turned on.
> >
> > In this case, since the checkpointer turned the flag off while holding
> > the lock, the backend sees that the flag is turned off, and doesn't
> > enqueue itself. No?
>
> Oops, I had mistake here. It should be "while the checkpointer process
> is holding SyncRepLock to turn *on* sync_standbys_defined, ...".
>
> >
> > > On the other hand if we do the
> > > check without SyncRepLock, backends who reach SyncRepWaitForLSN() will
> > > return instead of waiting, in spite of checkpointer process being
> > > turning on sync_standbys_defined. Which means these backends are
> > > failing to notice that it has been turned on, I thought.
> >
> > No. Or I'm missing something... In this case, the backend sees that
> > the flag is turned on without lock since checkpointer turned it on.
> > So you're thinking the following. Right?
> >
> > 1. sync_standbys_defined flag is false
> > 2. checkpointer takes the lock and turns the flag on
> > 3. backend sees the flag
> > 4. checkpointer releases the lock
> >
> > In #3, the flag indicates true, I think. But you think it's false?
>
> I meant the backends who reach SyncRepLock() while checkpointer is at
> after acquiring the lock but before turning the flag on, described at
> #3 step in the following sequence. Such backends will wait for the
> lock in the current code, but after applying the patch they return
> quickly. So what I'm thinking is:
>
> 1. sync_standbys_defined flag is false
> 2. checkpointer takes the lock
> 3. backend sees the flag, and return as it's still false
> 4. checkpointer turns the flag on
> 5. checkpointer releases the lock
>
> If a backend reaches SyncRepWaitForLSN() between #4 and #5 it will
> wait for the lock and then enqueue itself after acquiring the lock.
> But such behavior is not changed before and after applying the patch.
>

Fujii-san, I think we agree on how to fix this issue and on the patch
you proposed so please add your comments.

This item is for PG14, right? If so I'd like to add this item to the
next commit fest.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Ashwin Agrawal

Date:

19 May 2020, 15:56:13

On Mon, May 18, 2020 at 7:41 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:

This item is for PG14, right? If so I'd like to add this item to the
next commit fest.

Sure, add it to commit fest.

Seems though it should be backpatched to relevant branches as well.

Re: SyncRepLock acquired exclusively in default configuration

From

Michael Paquier

Date:

20 May 2020, 05:38:38

On Tue, May 19, 2020 at 08:56:13AM -0700, Ashwin Agrawal wrote:
> Sure, add it to commit fest.
> Seems though it should be backpatched to relevant branches as well.

It does not seem to be listed yet.  Are you planning to add it under
the section for bug fixes?
--
Michael

Attachment

signature.asc

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

09 July 2020, 10:52:01


On 2020/05/19 11:41, Masahiko Sawada wrote:
> On Sat, 11 Apr 2020 at 09:30, Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
>>
>> On Fri, 10 Apr 2020 at 21:52, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>
>>>
>>>
>>> On 2020/04/10 20:56, Masahiko Sawada wrote:
>>>> On Fri, 10 Apr 2020 at 18:57, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2020/04/10 14:11, Masahiko Sawada wrote:
>>>>>> On Fri, 10 Apr 2020 at 13:20, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2020/04/08 3:01, Ashwin Agrawal wrote:
>>>>>>>>
>>>>>>>> On Mon, Apr 6, 2020 at 2:14 PM Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> wrote:
>>>>>>>>
>>>>>>>>         > How about we change it to this ?
>>>>>>>>
>>>>>>>>        Hm. Better. But I think it might need at least a compiler barrier /
>>>>>>>>        volatile memory load?  Unlikely here, but otherwise the compiler could
>>>>>>>>        theoretically just stash the variable somewhere locally (it's not likely
>>>>>>>>        to be a problem because it'd not be long ago that we acquired an lwlock,
>>>>>>>>        which is a full barrier).
>>>>>>>>
>>>>>>>>
>>>>>>>> That's the part, I am not fully sure about. But reading the comment above SyncRepUpdateSyncStandbysDefined(),
itseems fine.
 
>>>>>>>>
>>>>>>>>         > Bring back the check which existed based on GUC but instead of just blindly
>>>>>>>>         > returning based on just GUC not being set, check
>>>>>>>>         > WalSndCtl->sync_standbys_defined. Thoughts?
>>>>>>>>
>>>>>>>>        Hm. Is there any reason not to just check
>>>>>>>>        WalSndCtl->sync_standbys_defined? rather than both !SyncStandbysDefined()
>>>>>>>>        and WalSndCtl->sync_standbys_defined?
>>>>>>>>
>>>>>>>>
>>>>>>>> Agree, just checking for WalSndCtl->sync_standbys_defined seems fine.
>>>>>>>
>>>>>>> So the consensus is something like the following? Patch attached.
>>>>>>>
>>>>>>>             /*
>>>>>>> -        * Fast exit if user has not requested sync replication.
>>>>>>> +        * Fast exit if user has not requested sync replication, or there are no
>>>>>>> +        * sync replication standby names defined.
>>>>>>>              */
>>>>>>> -       if (!SyncRepRequested())
>>>>>>> +       if (!SyncRepRequested() ||
>>>>>>> +               !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
>>>>>>>                     return;
>>>>>>>
>>>>>>
>>>>>> I think we need more comments describing why checking
>>>>>> sync_standby_defined without SyncRepLock is safe here. For example:
>>>>>
>>>>> Yep, agreed!
>>>>>
>>>>>> This routine gets called every commit time. So, to check if the
>>>>>> synchronous standbys is defined as quick as possible we check
>>>>>> WalSndCtl->sync_standbys_defined without acquiring SyncRepLock. Since
>>>>>> we make this test unlocked, there's a change we might fail to notice
>>>>>> that it has been turned off and continue processing.
>>>>>
>>>>> Does this really happen? I was thinking that the problem by not taking
>>>>> the lock here is that SyncRepWaitForLSN() can see that shared flag after
>>>>> SyncRepUpdateSyncStandbysDefined() wakes up all the waiters and
>>>>> before it sets the flag to false. Then if SyncRepWaitForLSN() adds  itself
>>>>> into the wait queue becaues the flag was true, without lock, it may keep
>>>>> sleeping infinitely.
>>>>
>>>> I think that because a backend does the following check after
>>>> acquiring SyncRepLock, in that case, once the backend has taken
>>>> SyncRepLock it can see that sync_standbys_defined is false and return.
>>>
>>> Yes, but the backend can see that sync_standby_defined indicates false
>>> whether holding SyncRepLock or not, after the checkpointer sets it to false.
>>>
>>>> But you meant that we do both checks without SyncRepLock?
>>>
>>> Maybe No. The change that the latest patch provides should be applied, I think.
>>> That is, sync_standbys_defined should be check without lock at first, then
>>> only if it's true, it should be checked again with lock.
>>
>> Yes. My understanding is the same.
>>
>> After applying your patch, SyncRepWaitForLSN() is going to become
>> something like:
>>
>>      /*
>>       * Fast exit if user has not requested sync replication, or there are no
>>       * sync replication standby names defined.
>>       */
>>      if (!SyncRepRequested() ||
>>          !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
>>          return;
>>
>>      Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
>>      Assert(WalSndCtl != NULL);
>>
>>      LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
>>      Assert(MyProc->syncRepState == SYNC_REP_NOT_WAITING);
>>
>>      /*
>>       * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
>>       * set.  See SyncRepUpdateSyncStandbysDefined.
>>       *
>>       * Also check that the standby hasn't already replied. Unlikely race
>>       * condition but we'll be fetching that cache line anyway so it's likely
>>       * to be a low cost check.
>>       */
>>      if (!WalSndCtl->sync_standbys_defined ||
>>          lsn <= WalSndCtl->lsn[mode])
>>      {
>>          LWLockRelease(SyncRepLock);
>>          return;
>>      }
>>
>>      /*
>>       * Set our waitLSN so WALSender will know when to wake us, and add
>>       * ourselves to the queue.
>>       */
>>      MyProc->waitLSN = lsn;
>>      MyProc->syncRepState = SYNC_REP_WAITING;
>>      SyncRepQueueInsert(mode);
>>      Assert(SyncRepQueueIsOrderedByLSN(mode));
>>      LWLockRelease(SyncRepLock);
>>
>> There are two checks of sync_standbys_defined. The first check is
>> performed without SyncRepLock and the second check is performed with
>> SyncRepLock. That's what you and I are expecting. Right?
>>
>>>
>>> ISTM that basically SyncRepLock is used in SyncRepWaitForLSN() and
>>> SyncRepUpdateSyncStandbysDefined() to make operation on the queue
>>> and enabling sync_standbys_defined atomic. Without lock, the issue that
>>> the comment in SyncRepUpdateSyncStandbysDefined() explains would
>>> happen. That is, the backend may keep waiting infinitely as follows.
>>>
>>
>> Let me think the following sequence after applying your changes:
>>
>>> 1. checkpointer calls SyncRepUpdateSyncStandbysDefined()
>>> 2. checkpointer sees that the flag indicates true but the config indicates false
>>> 3. checkpointer takes lock and wakes up all the waiters
>>> 4. backend calls SyncRepWaitForLSN() can see that the flag indicates true
>>
>> Yes, I suppose this is the first check of sync_standbys_defined.
>>
>> And before the second check, backend tries to acquire SyncRepLock but
>> since the lock is already being held by checkohpointer it must wait.
>>
>>> 5. checkpointer sets the flag to false and releases the lock
>>
>> After checkpointer release the lock, the backend is woken up.
>>
>>> 6. backend adds itself to the queue and wait until it's waken up, but will not happen immediately
>>
>> The backend sees that the flag has been false at the second check, so return.
>>
>> If we didn't acquire SyncRepLock even for the second check I think the
>> backend would keep waiting infinitely as you mentioned.
>>
>>>
>>> So after the backend sees that the flag indicates true without lock,
>>> it must check the flag again with lock immediately without operating
>>> the queue. If this my understanding is right, I was thinking that
>>> the comment should mention these things.
>>
>> I think that's right. I was going to describe why we do the first
>> check without SyncRepLock and why it is safe but it seems to me that
>> these things you mentioned are related to the second check, if I'm not
>> missing something.
>>
>>>
>>>>       /*
>>>>        * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
>>>>        * set.  See SyncRepUpdateSyncStandbysDefined.
>>>>        *
>>>>        * Also check that the standby hasn't already replied. Unlikely race
>>>>        * condition but we'll be fetching that cache line anyway so it's likely
>>>>        * to be a low cost check.
>>>>        */
>>>>       if (!WalSndCtl->sync_standbys_defined ||
>>>>           lsn <= WalSndCtl->lsn[mode])
>>>>       {
>>>>           LWLockRelease(SyncRepLock);
>>>>           return;
>>>>       }
>>>>
>>>>>
>>>>>> But since the
>>>>>> subsequent check will check it again while holding SyncRepLock, it's
>>>>>> no problem. Similarly even if we fail to notice that it has been
>>>>>> turned on
>>>>> Is this true? ISTM that after SyncRepUpdateSyncStandbysDefined()
>>>>> sets the flag to true, SyncRepWaitForLSN() basically doesn't seem
>>>>> to fail to notice that. No?
>>>>
>>>> What I wanted to say is, in the current code, while the checkpointer
>>>> process is holding SyncRepLock to turn off sync_standbys_defined,
>>>> backends who reach SyncRepWaitForLSN() wait for the lock. Then, after
>>>> the checkpointer process releases SyncRepLock these backend can
>>>> enqueue themselves to the wait queue because they can see that
>>>> sync_standbys_defined is turned on.
>>>
>>> In this case, since the checkpointer turned the flag off while holding
>>> the lock, the backend sees that the flag is turned off, and doesn't
>>> enqueue itself. No?
>>
>> Oops, I had mistake here. It should be "while the checkpointer process
>> is holding SyncRepLock to turn *on* sync_standbys_defined, ...".
>>
>>>
>>>> On the other hand if we do the
>>>> check without SyncRepLock, backends who reach SyncRepWaitForLSN() will
>>>> return instead of waiting, in spite of checkpointer process being
>>>> turning on sync_standbys_defined. Which means these backends are
>>>> failing to notice that it has been turned on, I thought.
>>>
>>> No. Or I'm missing something... In this case, the backend sees that
>>> the flag is turned on without lock since checkpointer turned it on.
>>> So you're thinking the following. Right?
>>>
>>> 1. sync_standbys_defined flag is false
>>> 2. checkpointer takes the lock and turns the flag on
>>> 3. backend sees the flag
>>> 4. checkpointer releases the lock
>>>
>>> In #3, the flag indicates true, I think. But you think it's false?
>>
>> I meant the backends who reach SyncRepLock() while checkpointer is at
>> after acquiring the lock but before turning the flag on, described at
>> #3 step in the following sequence. Such backends will wait for the
>> lock in the current code, but after applying the patch they return
>> quickly. So what I'm thinking is:
>>
>> 1. sync_standbys_defined flag is false
>> 2. checkpointer takes the lock
>> 3. backend sees the flag, and return as it's still false
>> 4. checkpointer turns the flag on
>> 5. checkpointer releases the lock
>>
>> If a backend reaches SyncRepWaitForLSN() between #4 and #5 it will
>> wait for the lock and then enqueue itself after acquiring the lock.
>> But such behavior is not changed before and after applying the patch.
>>
> 
> Fujii-san, I think we agree on how to fix this issue and on the patch
> you proposed so please add your comments.

Sorry for the late reply...

Regarding how to fix, don't we need memory barrier when reading
sync_standbys_defined? Without that, after SyncRepUpdateSyncStandbysDefined()
updates it to true, SyncRepWaitForLSN() can see the previous value,
i.e., false, and then exit out of the function. Is this right?
If this is right, we need memory barrier to avoid this issue?


> This item is for PG14, right?

Yes!

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: SyncRepLock acquired exclusively in default configuration

From

Asim Praveen

Date:

11 August 2020, 11:55:05

The proposed fix looks good, it resolves the lock contention problem as intended. +1 from my side.

> On 09-Jul-2020, at 4:22 PM, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> 
> 
> Regarding how to fix, don't we need memory barrier when reading
> sync_standbys_defined? Without that, after SyncRepUpdateSyncStandbysDefined()
> updates it to true, SyncRepWaitForLSN() can see the previous value,
> i.e., false, and then exit out of the function. Is this right?
> If this is right, we need memory barrier to avoid this issue?
> 

There is no out-of-order execution hazard in the scenario you are describing, memory barriers don’t seem to fit.  Using
locksto synchronise checkpointer process and a committing backend process is the right way.  We have made a conscious
decisionto bypass the lock, which looks correct in this case.

As an aside, there is a small (?) window where a change to synchronous_standby_names GUC is partially propagated among
committingbackends, checkpointer and walsender.  Such a window may result in walsender declaring a standby as
synchronouswhile a commit backend fails to wait for it in SyncRepWaitForLSN.  The root cause is walsender uses
sync_standby_priority,a per-walsender variable to tell if a standby is synchronous.  It is updated when walsender
processesa config change.  Whereas sync_standbys_defined, a variable updated by checkpointer, is used by committing
backendsto determine if they need to wait.  If checkpointer is busy flushing buffers, it may take longer than walsender
toreflect a change in sync_standbys_defined.  This is a low impact problem, should be ok to live with it.

Asim

Re: SyncRepLock acquired exclusively in default configuration

From

Robert Haas

Date:

11 August 2020, 15:27:16

On Tue, Aug 11, 2020 at 7:55 AM Asim Praveen <pasim@vmware.com> wrote:
> There is no out-of-order execution hazard in the scenario you are describing, memory barriers don’t seem to fit.
Usinglocks to synchronise checkpointer process and a committing backend process is the right way. We have made a
consciousdecision to bypass the lock, which looks correct in this case.

Yeah, I am not immediately seeing why a memory barrier would help anything here.

> As an aside, there is a small (?) window where a change to synchronous_standby_names GUC is partially propagated
amongcommitting backends, checkpointer and walsender. Such a window may result in walsender declaring a standby as
synchronouswhile a commit backend fails to wait for it in SyncRepWaitForLSN. The root cause is walsender uses
sync_standby_priority,a per-walsender variable to tell if a standby is synchronous. It is updated when walsender
processesa config change. Whereas sync_standbys_defined, a variable updated by checkpointer, is used by committing
backendsto determine if they need to wait. If checkpointer is busy flushing buffers, it may take longer than walsender
toreflect a change in sync_standbys_defined. This is a low impact problem, should be ok to live with it.

I think this gets to the root of the issue. If we check the flag
without a lock, we might see a slightly stale value. But, considering
that there's no particular amount of time within which configuration
changes are guaranteed to take effect, maybe that's OK. However, there
is one potential gotcha here: if the walsender declares the standby to
be synchronous, a user can see that, right? So maybe there's this
problem: a user sees that the standby is synchronous and expects a
transaction committing afterward to provoke a wait, but really it
doesn't. Now the user is unhappy, feeling that the system didn't
perform according to expectations.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: SyncRepLock acquired exclusively in default configuration

From

Asim Praveen

Date:

12 August 2020, 05:06:43

> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> 
> I think this gets to the root of the issue. If we check the flag
> without a lock, we might see a slightly stale value. But, considering
> that there's no particular amount of time within which configuration
> changes are guaranteed to take effect, maybe that's OK. However, there
> is one potential gotcha here: if the walsender declares the standby to
> be synchronous, a user can see that, right? So maybe there's this
> problem: a user sees that the standby is synchronous and expects a
> transaction committing afterward to provoke a wait, but really it
> doesn't. Now the user is unhappy, feeling that the system didn't
> perform according to expectations.

Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to something
otherthan 0.

The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost.

Asim

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

12 August 2020, 06:32:39

On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
>
>
>
> > On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > I think this gets to the root of the issue. If we check the flag
> > without a lock, we might see a slightly stale value. But, considering
> > that there's no particular amount of time within which configuration
> > changes are guaranteed to take effect, maybe that's OK. However, there
> > is one potential gotcha here: if the walsender declares the standby to
> > be synchronous, a user can see that, right? So maybe there's this
> > problem: a user sees that the standby is synchronous and expects a
> > transaction committing afterward to provoke a wait, but really it
> > doesn't. Now the user is unhappy, feeling that the system didn't
> > perform according to expectations.
>
> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to something
otherthan 0. 
>
> The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost. 

I think that if the standby is quite behind the primary and in case of
the primary crashes, the likelihood of losing commits might get
higher. The user can see the standby became synchronous standby via
pg_stat_replication but commit completes without a wait because the
checkpointer doesn't update sync_standbys_defined yet. If the primary
crashes before standby catching up and the user does failover, the
committed transaction will be lost, even though the user expects that
transaction commit has been replicated to the standby synchronously.
And this can happen even without the patch, right?

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

19 August 2020, 12:41:03


On 2020/08/12 15:32, Masahiko Sawada wrote:
> On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
>>
>>
>>
>>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> I think this gets to the root of the issue. If we check the flag
>>> without a lock, we might see a slightly stale value. But, considering
>>> that there's no particular amount of time within which configuration
>>> changes are guaranteed to take effect, maybe that's OK. However, there
>>> is one potential gotcha here: if the walsender declares the standby to
>>> be synchronous, a user can see that, right? So maybe there's this
>>> problem: a user sees that the standby is synchronous and expects a
>>> transaction committing afterward to provoke a wait, but really it
>>> doesn't. Now the user is unhappy, feeling that the system didn't
>>> perform according to expectations.
>>
>> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to something
otherthan 0.
 
>>
>> The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost.
 
> 
> I think that if the standby is quite behind the primary and in case of
> the primary crashes, the likelihood of losing commits might get
> higher. The user can see the standby became synchronous standby via
> pg_stat_replication but commit completes without a wait because the
> checkpointer doesn't update sync_standbys_defined yet. If the primary
> crashes before standby catching up and the user does failover, the
> committed transaction will be lost, even though the user expects that
> transaction commit has been replicated to the standby synchronously.
> And this can happen even without the patch, right?

I think you're right. This issue can happen even without the patch.

Maybe we should not mark the standby as "sync" whenever sync_standbys_defined
is false even if synchronous_standby_names is actually set and walsenders have
already detect that? Or we need more aggressive approach;
make the checkpointer update sync_standby_priority values of
all the walsenders? ISTM that the latter looks overkill...

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: SyncRepLock acquired exclusively in default configuration

From

Asim Praveen

Date:

19 August 2020, 13:20:46


> On 12-Aug-2020, at 12:02 PM, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
> 
> On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
>> 
>> 
>> 
>>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> 
>>> I think this gets to the root of the issue. If we check the flag
>>> without a lock, we might see a slightly stale value. But, considering
>>> that there's no particular amount of time within which configuration
>>> changes are guaranteed to take effect, maybe that's OK. However, there
>>> is one potential gotcha here: if the walsender declares the standby to
>>> be synchronous, a user can see that, right? So maybe there's this
>>> problem: a user sees that the standby is synchronous and expects a
>>> transaction committing afterward to provoke a wait, but really it
>>> doesn't. Now the user is unhappy, feeling that the system didn't
>>> perform according to expectations.
>> 
>> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to something
otherthan 0.
 
>> 
>> The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost.
 
> 
> I think that if the standby is quite behind the primary and in case of
> the primary crashes, the likelihood of losing commits might get
> higher. The user can see the standby became synchronous standby via
> pg_stat_replication but commit completes without a wait because the
> checkpointer doesn't update sync_standbys_defined yet. If the primary
> crashes before standby catching up and the user does failover, the
> committed transaction will be lost, even though the user expects that
> transaction commit has been replicated to the standby synchronously.
> And this can happen even without the patch, right?
> 

It is correct that the issue is orthogonal to the patch upthread and I’ve marked
the commitfest entry as ready-for-committer.

Regarding the issue described above, the amount by which the standby is lagging
behind the primary does not affect the severity.  A standby’s state will be
reported as “sync” to the user only after the standby has caught up (state ==
WALSNDSTATE_STREAMING).  The time it takes for the checkpointer to update the
sync_standbys_defined flag in shared memory is the important factor.  Once
checkpointer sets this flag, commits start waiting for the standby (as long as
it is in-sync).

Asim

Re: SyncRepLock acquired exclusively in default configuration

From

Kyotaro Horiguchi

Date:

20 August 2020, 02:29:46

At Wed, 19 Aug 2020 13:20:46 +0000, Asim Praveen <pasim@vmware.com> wrote in 
> 
> 
> > On 12-Aug-2020, at 12:02 PM, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
> > 
> > On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
> >> 
> >> 
> >> 
> >>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>> 
> >>> I think this gets to the root of the issue. If we check the flag
> >>> without a lock, we might see a slightly stale value. But, considering
> >>> that there's no particular amount of time within which configuration
> >>> changes are guaranteed to take effect, maybe that's OK. However, there
> >>> is one potential gotcha here: if the walsender declares the standby to
> >>> be synchronous, a user can see that, right? So maybe there's this
> >>> problem: a user sees that the standby is synchronous and expects a
> >>> transaction committing afterward to provoke a wait, but really it
> >>> doesn't. Now the user is unhappy, feeling that the system didn't
> >>> perform according to expectations.
> >> 
> >> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to
somethingother than 0.
 
> >> 
> >> The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost.
 
> > 
> > I think that if the standby is quite behind the primary and in case of
> > the primary crashes, the likelihood of losing commits might get
> > higher. The user can see the standby became synchronous standby via
> > pg_stat_replication but commit completes without a wait because the
> > checkpointer doesn't update sync_standbys_defined yet. If the primary
> > crashes before standby catching up and the user does failover, the
> > committed transaction will be lost, even though the user expects that
> > transaction commit has been replicated to the standby synchronously.
> > And this can happen even without the patch, right?
> > 
> 
> It is correct that the issue is orthogonal to the patch upthread and I’ve marked
> the commitfest entry as ready-for-committer.

I find the name of SyncStandbysDefined macro is very confusing with
the struct member sync_standbys_defined, but that might be another
issue..

-     * Fast exit if user has not requested sync replication.
+     * Fast exit if user has not requested sync replication, or there are no
+     * sync replication standby names defined.

This comment sounds like we just do that twice. The reason for the
check is to avoid wasteful exclusive locks on SyncRepLock, or to form
double-checked locking on the variable. I think we should explain that
here.

> Regarding the issue described above, the amount by which the standby is lagging
> behind the primary does not affect the severity.  A standby’s state will be
> reported as “sync” to the user only after the standby has caught up (state ==
> WALSNDSTATE_STREAMING).  The time it takes for the checkpointer to update the
> sync_standbys_defined flag in shared memory is the important factor.  Once
> checkpointer sets this flag, commits start waiting for the standby (as long as
> it is in-sync).

CATCHUP state is only entered at replication startup. It stays at
STREAMING when sync_sby_names is switched from '' to a valid name,
thus sync_state shows 'sync' even if checkpointer hasn't changed
sync_standbys_defined. If the standby being switched had a big lag,
the chance of losing commits getting larger (up to certain extent) for
the same extent of checkpointer lag.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: SyncRepLock acquired exclusively in default configuration

From

Kyotaro Horiguchi

Date:

20 August 2020, 03:12:50

At Wed, 19 Aug 2020 21:41:03 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2020/08/12 15:32, Masahiko Sawada wrote:
> > On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
> >>
> >>
> >>
> >>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>>
> >>> I think this gets to the root of the issue. If we check the flag
> >>> without a lock, we might see a slightly stale value. But, considering
> >>> that there's no particular amount of time within which configuration
> >>> changes are guaranteed to take effect, maybe that's OK. However, there
> >>> is one potential gotcha here: if the walsender declares the standby to
> >>> be synchronous, a user can see that, right? So maybe there's this
> >>> problem: a user sees that the standby is synchronous and expects a
> >>> transaction committing afterward to provoke a wait, but really it
> >>> doesn't. Now the user is unhappy, feeling that the system didn't
> >>> perform according to expectations.
> >>
> >> Yes, pg_stat_replication reports a standby in sync as soon as
> >> walsender updates priority of the standby to something other than 0.
> >>
> >> The potential gotcha referred above doesn’t seem too severe.  What is
> >> the likelihood of someone setting synchronous_standby_names GUC with
> >> either “*” or a standby name and then immediately promoting that
> >> standby?  If the standby is promoted before the checkpointer on master
> >> gets a chance to update sync_standbys_defined in shared memory,
> >> commits made during this interval on master may not make it to
> >> standby.  Upon promotion, those commits may be lost.
> > I think that if the standby is quite behind the primary and in case of
> > the primary crashes, the likelihood of losing commits might get
> > higher. The user can see the standby became synchronous standby via
> > pg_stat_replication but commit completes without a wait because the
> > checkpointer doesn't update sync_standbys_defined yet. If the primary
> > crashes before standby catching up and the user does failover, the
> > committed transaction will be lost, even though the user expects that
> > transaction commit has been replicated to the standby synchronously.
> > And this can happen even without the patch, right?
> 
> I think you're right. This issue can happen even without the patch.
> 
> Maybe we should not mark the standby as "sync" whenever
> sync_standbys_defined
> is false even if synchronous_standby_names is actually set and
> walsenders have
> already detect that? Or we need more aggressive approach;
> make the checkpointer update sync_standby_priority values of
> all the walsenders? ISTM that the latter looks overkill...

It seems to me that the issue here is what
pg_stat_replication.sync_status doens't show "the working state of the
walsdner", but "the state the walsender is commanded". Non-zero
WalSnd.sync_standby_priority is immediately considered as "I am in
sync" but actually it is "I am going to sync from async or am already
in sync". And it is precisely "..., or am already in sync if
checkpointer already notices any sync walsender exists".

1. if a walsender changes its state from async to sync, it should once
  change its state back to "CATCHUP" or something like that.

2. pg_stat_replication.sync_status need to consider WalSnd.state
  and WalSndCtlData.sync_standbys_defined.

We might be able to let SyncRepUpdateSyncStandbysDefined postpone
changing sync_standbys_defined until any sync standby actually comes,
but it would be complex.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

21 August 2020, 01:20:09

On Wed, 19 Aug 2020 at 21:41, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/08/12 15:32, Masahiko Sawada wrote:
> > On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
> >>
> >>
> >>
> >>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>>
> >>> I think this gets to the root of the issue. If we check the flag
> >>> without a lock, we might see a slightly stale value. But, considering
> >>> that there's no particular amount of time within which configuration
> >>> changes are guaranteed to take effect, maybe that's OK. However, there
> >>> is one potential gotcha here: if the walsender declares the standby to
> >>> be synchronous, a user can see that, right? So maybe there's this
> >>> problem: a user sees that the standby is synchronous and expects a
> >>> transaction committing afterward to provoke a wait, but really it
> >>> doesn't. Now the user is unhappy, feeling that the system didn't
> >>> perform according to expectations.
> >>
> >> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to
somethingother than 0. 
> >>
> >> The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost. 
> >
> > I think that if the standby is quite behind the primary and in case of
> > the primary crashes, the likelihood of losing commits might get
> > higher. The user can see the standby became synchronous standby via
> > pg_stat_replication but commit completes without a wait because the
> > checkpointer doesn't update sync_standbys_defined yet. If the primary
> > crashes before standby catching up and the user does failover, the
> > committed transaction will be lost, even though the user expects that
> > transaction commit has been replicated to the standby synchronously.
> > And this can happen even without the patch, right?
>
> I think you're right. This issue can happen even without the patch.
>
> Maybe we should not mark the standby as "sync" whenever sync_standbys_defined
> is false even if synchronous_standby_names is actually set and walsenders have
> already detect that?

It seems good. I guess that we can set 'async' to sync_status and 0 to
sync_priority when sync_standbys_defined is not true regardless of
walsender's actual priority value. We print the message "standby
\"%s\" now has synchronous standby priority %u" in SyncRepInitConfig()
regardless of sync_standbys_defined but maybe it's fine as the message
isn't incorrect and it's DEBUG1 message.

> Or we need more aggressive approach;
> make the checkpointer update sync_standby_priority values of
> all the walsenders? ISTM that the latter looks overkill...

I think so too.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

26 August 2020, 17:40:29


On 2020/08/20 11:29, Kyotaro Horiguchi wrote:
> At Wed, 19 Aug 2020 13:20:46 +0000, Asim Praveen <pasim@vmware.com> wrote in
>>
>>
>>> On 12-Aug-2020, at 12:02 PM, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
>>>
>>> On Wed, 12 Aug 2020 at 14:06, Asim Praveen <pasim@vmware.com> wrote:
>>>>
>>>>
>>>>
>>>>> On 11-Aug-2020, at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>>>
>>>>> I think this gets to the root of the issue. If we check the flag
>>>>> without a lock, we might see a slightly stale value. But, considering
>>>>> that there's no particular amount of time within which configuration
>>>>> changes are guaranteed to take effect, maybe that's OK. However, there
>>>>> is one potential gotcha here: if the walsender declares the standby to
>>>>> be synchronous, a user can see that, right? So maybe there's this
>>>>> problem: a user sees that the standby is synchronous and expects a
>>>>> transaction committing afterward to provoke a wait, but really it
>>>>> doesn't. Now the user is unhappy, feeling that the system didn't
>>>>> perform according to expectations.
>>>>
>>>> Yes, pg_stat_replication reports a standby in sync as soon as walsender updates priority of the standby to
somethingother than 0.
 
>>>>
>>>> The potential gotcha referred above doesn’t seem too severe.  What is the likelihood of someone setting
synchronous_standby_namesGUC with either “*” or a standby name and then immediately promoting that standby?  If the
standbyis promoted before the checkpointer on master gets a chance to update sync_standbys_defined in shared memory,
commitsmade during this interval on master may not make it to standby.  Upon promotion, those commits may be lost.
 
>>>
>>> I think that if the standby is quite behind the primary and in case of
>>> the primary crashes, the likelihood of losing commits might get
>>> higher. The user can see the standby became synchronous standby via
>>> pg_stat_replication but commit completes without a wait because the
>>> checkpointer doesn't update sync_standbys_defined yet. If the primary
>>> crashes before standby catching up and the user does failover, the
>>> committed transaction will be lost, even though the user expects that
>>> transaction commit has been replicated to the standby synchronously.
>>> And this can happen even without the patch, right?
>>>
>>
>> It is correct that the issue is orthogonal to the patch upthread and I’ve marked
>> the commitfest entry as ready-for-committer.

Yes, thanks for the review!


> I find the name of SyncStandbysDefined macro is very confusing with
> the struct member sync_standbys_defined, but that might be another
> issue..
> 
> -     * Fast exit if user has not requested sync replication.
> +     * Fast exit if user has not requested sync replication, or there are no
> +     * sync replication standby names defined.
> 
> This comment sounds like we just do that twice. The reason for the
> check is to avoid wasteful exclusive locks on SyncRepLock, or to form
> double-checked locking on the variable. I think we should explain that
> here.

I added the following comments based on the suggestion by Sawada-san upthread. Thought?

+     * Since this routine gets called every commit time, it's important to
+     * exit quickly if sync replication is not requested. So we check
+     * WalSndCtl->sync_standbys_define without the lock and exit
+     * immediately if it's false. If it's true, we check it again later
+     * while holding the lock, to avoid the race condition described
+     * in SyncRepUpdateSyncStandbysDefined().


Attached is the updated version of the patch. I didn't change how to
fix the issue. But I changed the check for fast exit so that it's called
before setting the "mode", to avoid a few cycle.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment

syncreplock_v2.patch

Re: SyncRepLock acquired exclusively in default configuration

From

Asim Praveen

Date:

27 August 2020, 06:59:13

> On 26-Aug-2020, at 11:10 PM, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> 
> I added the following comments based on the suggestion by Sawada-san upthread. Thought?
> 
> +     * Since this routine gets called every commit time, it's important to
> +     * exit quickly if sync replication is not requested. So we check
> +     * WalSndCtl->sync_standbys_define without the lock and exit
> +     * immediately if it's false. If it's true, we check it again later
> +     * while holding the lock, to avoid the race condition described
> +     * in SyncRepUpdateSyncStandbysDefined().
> 

+1.  May I suggest the following addition to the above comment (feel free to
rephrase / reject)?

"If sync_standbys_defined was being set from false to true and we observe it as
false, it ok to skip the wait.  Replication was async and it is in the process
of being changed to sync, due to user request.  Subsequent commits will observe
the change and start waiting.”

> 
> Attached is the updated version of the patch. I didn't change how to
> fix the issue. But I changed the check for fast exit so that it's called
> before setting the "mode", to avoid a few cycle.
> 


Looks good to me.  There is a typo in the comment:

    s/sync_standbys_define/sync_standbys_defined/

Asim

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

28 August 2020, 01:33:45


On 2020/08/27 15:59, Asim Praveen wrote:
> 
>> On 26-Aug-2020, at 11:10 PM, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>> I added the following comments based on the suggestion by Sawada-san upthread. Thought?
>>
>> +     * Since this routine gets called every commit time, it's important to
>> +     * exit quickly if sync replication is not requested. So we check
>> +     * WalSndCtl->sync_standbys_define without the lock and exit
>> +     * immediately if it's false. If it's true, we check it again later
>> +     * while holding the lock, to avoid the race condition described
>> +     * in SyncRepUpdateSyncStandbysDefined().
>>
> 
> +1.  May I suggest the following addition to the above comment (feel free to
> rephrase / reject)?
> 
> "If sync_standbys_defined was being set from false to true and we observe it as
> false, it ok to skip the wait.  Replication was async and it is in the process
> of being changed to sync, due to user request.  Subsequent commits will observe
> the change and start waiting.”

Thanks for the suggestion! I'm not sure if it's worth adding this because
it seems obvious thing. But maybe you imply that we need to comment
why the lock is not necessary when sync_standbys_defined is false. Right?
If so, what about updating the comments as follows?

+     * Since this routine gets called every commit time, it's important to
+     * exit quickly if sync replication is not requested. So we check
+     * WalSndCtl->sync_standbys_defined flag without the lock and exit
+     * immediately if it's false. If it's true, we need to check it again later
+     * while holding the lock, to check the flag and operate the sync rep
+     * queue atomically. This is necessary to avoid the race condition
+     * described in SyncRepUpdateSyncStandbysDefined(). On the other
+     * hand, if it's false, the lock is not necessary because we don't touch
+     * the queue.

> 
>>
>> Attached is the updated version of the patch. I didn't change how to
>> fix the issue. But I changed the check for fast exit so that it's called
>> before setting the "mode", to avoid a few cycle.
>>
> 
> 
> Looks good to me.  There is a typo in the comment:
> 
>      s/sync_standbys_define/sync_standbys_defined/

Fixed. Thanks!

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment

syncreplock_v3.patch

Re: SyncRepLock acquired exclusively in default configuration

From

Asim Praveen

Date:

28 August 2020, 11:06:09


> On 28-Aug-2020, at 7:03 AM, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> 
> On 2020/08/27 15:59, Asim Praveen wrote:
>> 
>> +1.  May I suggest the following addition to the above comment (feel free to
>> rephrase / reject)?
>> "If sync_standbys_defined was being set from false to true and we observe it as
>> false, it ok to skip the wait.  Replication was async and it is in the process
>> of being changed to sync, due to user request.  Subsequent commits will observe
>> the change and start waiting.”
> 
> Thanks for the suggestion! I'm not sure if it's worth adding this because
> it seems obvious thing. But maybe you imply that we need to comment
> why the lock is not necessary when sync_standbys_defined is false. Right?
> If so, what about updating the comments as follows?
> 
> +     * Since this routine gets called every commit time, it's important to
> +     * exit quickly if sync replication is not requested. So we check
> +     * WalSndCtl->sync_standbys_defined flag without the lock and exit
> +     * immediately if it's false. If it's true, we need to check it again later
> +     * while holding the lock, to check the flag and operate the sync rep
> +     * queue atomically. This is necessary to avoid the race condition
> +     * described in SyncRepUpdateSyncStandbysDefined(). On the other
> +     * hand, if it's false, the lock is not necessary because we don't touch
> +     * the queue.

Thank you for updating the comment.  This looks better.

Asim

Re: SyncRepLock acquired exclusively in default configuration

From

Masahiko Sawada

Date:

28 August 2020, 12:20:01

On Fri, 28 Aug 2020 at 10:33, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/08/27 15:59, Asim Praveen wrote:
> >
> >> On 26-Aug-2020, at 11:10 PM, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>
> >> I added the following comments based on the suggestion by Sawada-san upthread. Thought?
> >>
> >> +     * Since this routine gets called every commit time, it's important to
> >> +     * exit quickly if sync replication is not requested. So we check
> >> +     * WalSndCtl->sync_standbys_define without the lock and exit
> >> +     * immediately if it's false. If it's true, we check it again later
> >> +     * while holding the lock, to avoid the race condition described
> >> +     * in SyncRepUpdateSyncStandbysDefined().
> >>
> >
> > +1.  May I suggest the following addition to the above comment (feel free to
> > rephrase / reject)?
> >
> > "If sync_standbys_defined was being set from false to true and we observe it as
> > false, it ok to skip the wait.  Replication was async and it is in the process
> > of being changed to sync, due to user request.  Subsequent commits will observe
> > the change and start waiting.”
>
> Thanks for the suggestion! I'm not sure if it's worth adding this because
> it seems obvious thing. But maybe you imply that we need to comment
> why the lock is not necessary when sync_standbys_defined is false. Right?
> If so, what about updating the comments as follows?
>
> +        * Since this routine gets called every commit time, it's important to
> +        * exit quickly if sync replication is not requested. So we check
> +        * WalSndCtl->sync_standbys_defined flag without the lock and exit
> +        * immediately if it's false. If it's true, we need to check it again later
> +        * while holding the lock, to check the flag and operate the sync rep
> +        * queue atomically. This is necessary to avoid the race condition
> +        * described in SyncRepUpdateSyncStandbysDefined(). On the other
> +        * hand, if it's false, the lock is not necessary because we don't touch
> +        * the queue.
>
> >
> >>
> >> Attached is the updated version of the patch. I didn't change how to
> >> fix the issue. But I changed the check for fast exit so that it's called
> >> before setting the "mode", to avoid a few cycle.
> >>
> >
> >
> > Looks good to me.  There is a typo in the comment:
> >
> >      s/sync_standbys_define/sync_standbys_defined/
>
> Fixed. Thanks!
>

Both v2 and v3 look good to me too. IMO I'm okay with and without the
last sentence.


Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: SyncRepLock acquired exclusively in default configuration

From

Fujii Masao

Date:

02 September 2020, 01:58:58


On 2020/08/28 21:20, Masahiko Sawada wrote:
> On Fri, 28 Aug 2020 at 10:33, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>>
>>
>> On 2020/08/27 15:59, Asim Praveen wrote:
>>>
>>>> On 26-Aug-2020, at 11:10 PM, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>
>>>> I added the following comments based on the suggestion by Sawada-san upthread. Thought?
>>>>
>>>> +     * Since this routine gets called every commit time, it's important to
>>>> +     * exit quickly if sync replication is not requested. So we check
>>>> +     * WalSndCtl->sync_standbys_define without the lock and exit
>>>> +     * immediately if it's false. If it's true, we check it again later
>>>> +     * while holding the lock, to avoid the race condition described
>>>> +     * in SyncRepUpdateSyncStandbysDefined().
>>>>
>>>
>>> +1.  May I suggest the following addition to the above comment (feel free to
>>> rephrase / reject)?
>>>
>>> "If sync_standbys_defined was being set from false to true and we observe it as
>>> false, it ok to skip the wait.  Replication was async and it is in the process
>>> of being changed to sync, due to user request.  Subsequent commits will observe
>>> the change and start waiting.”
>>
>> Thanks for the suggestion! I'm not sure if it's worth adding this because
>> it seems obvious thing. But maybe you imply that we need to comment
>> why the lock is not necessary when sync_standbys_defined is false. Right?
>> If so, what about updating the comments as follows?
>>
>> +        * Since this routine gets called every commit time, it's important to
>> +        * exit quickly if sync replication is not requested. So we check
>> +        * WalSndCtl->sync_standbys_defined flag without the lock and exit
>> +        * immediately if it's false. If it's true, we need to check it again later
>> +        * while holding the lock, to check the flag and operate the sync rep
>> +        * queue atomically. This is necessary to avoid the race condition
>> +        * described in SyncRepUpdateSyncStandbysDefined(). On the other
>> +        * hand, if it's false, the lock is not necessary because we don't touch
>> +        * the queue.
>>
>>>
>>>>
>>>> Attached is the updated version of the patch. I didn't change how to
>>>> fix the issue. But I changed the check for fast exit so that it's called
>>>> before setting the "mode", to avoid a few cycle.
>>>>
>>>
>>>
>>> Looks good to me.  There is a typo in the comment:
>>>
>>>       s/sync_standbys_define/sync_standbys_defined/
>>
>> Fixed. Thanks!
>>
> 
> Both v2 and v3 look good to me too. IMO I'm okay with and without the
> last sentence.

Asim and Sawada-san, thanks for the review! I pushed the patch.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: SyncRepLock acquired exclusively in default configuration

From

Andres Freund

Date:

02 September 2020, 02:12:05

On 2020-09-02 10:58:58 +0900, Fujii Masao wrote:
> Asim and Sawada-san, thanks for the review! I pushed the patch.

Thanks for all your combined work!