Thread: Problem with synchronous replication
Hi,
I recently discovered two possible bugs in synchronous replication.

1. SyncRepCleanupAtProcExit may delete an element that has already been deleted

SyncRepCleanupAtProcExit first checks whether the queue element is detached and, if it is not,
acquires SyncRepLock and deletes it. If the walsender deletes the element in the meantime,
it gets deleted twice and SHMQueueDelete crashes with a segmentation fault.

IMO, like SyncRepCancelWait, we should acquire SyncRepLock first and then check
whether the queue element is detached or not.
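For illustration, a minimal sketch of the ordering I have in mind, using the existing
SHMQueueIsDetached/SHMQueueDelete interfaces (the attached patch may differ in details):

    /* in src/backend/replication/syncrep.c */
    void
    SyncRepCleanupAtProcExit(void)
    {
        /*
         * Take SyncRepLock before testing the queue link, so that a concurrent
         * walsender cannot detach the element between the check and the delete.
         */
        LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
        if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
            SHMQueueDelete(&(MyProc->syncRepLinks));
        LWLockRelease(SyncRepLock);
    }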
2. SyncRepWaitForLSN may not call SyncRepCancelWait if ereport processes an interrupt

In SyncRepWaitForLSN, if a query cancel interrupt arrives, we just terminate the wait
with a suitable warning, as follows:

a. set QueryCancelPending to false
b. ereport outputs a warning
c. call SyncRepCancelWait to delete the element from the queue

If another cancel interrupt arrives while we are emitting the warning in step b, errfinish
calls CHECK_FOR_INTERRUPTS, which can raise an ERROR such as "canceling autovacuum
task", and the process then jumps back to sigsetjmp. Unfortunately, step c is skipped
and the element that should have been deleted by SyncRepCancelWait remains in the queue.

The easiest way to fix this is to swap steps b and c, as sketched below. Alternatively,
letting the sigsetjmp block clean up the queue may also be a good choice. What do you think?
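A rough sketch of that reordering in SyncRepWaitForLSN()'s cancel branch (the WARNING
text is abbreviated and the errdetail omitted, so this is not the exact message):

    if (QueryCancelPending)
    {
        QueryCancelPending = false;

        /*
         * Remove ourselves from the wait queue before emitting the WARNING,
         * so that an interrupt raised inside errfinish() cannot skip the
         * cleanup.
         */
        SyncRepCancelWait();

        ereport(WARNING,
                (errmsg("canceling wait for synchronous replication due to user request")));
        break;
    }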
The patch is attached; any feedback is greatly appreciated.
Best regards,
--
Dongming Liu
Can someone help me to confirm that these two problems are bugs?
If they are bugs, please help review the patch or provide better fix suggestions.
Thanks.
Best regards,
--
Dongming Liu
Hello.

At Fri, 25 Oct 2019 15:18:34 +0800, "Dongming Liu" <lingce.ldm@alibaba-inc.com> wrote in
> 1. SyncRepCleanupAtProcExit may delete an element that has already been deleted
> [...]
> IMO, like SyncRepCancelWait, we should acquire SyncRepLock first and then check
> whether the queue element is detached or not.

I think you're right here.

> 2. SyncRepWaitForLSN may not call SyncRepCancelWait if ereport processes an interrupt
> [...]
> Unfortunately, step c is skipped and the element that should have been deleted by
> SyncRepCancelWait remains in the queue.

This is not right. It is in transaction commit so it is in a
HOLD_INTERRUPTS section. ProcessInterrupts does not respond to
cancel/die interrupts, thus the ereport should return.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
On Tue, Oct 29, 2019 at 01:40:41PM +0800, Dongming Liu wrote:
> Can someone help me to confirm that these two problems are bugs?
> If they are bugs, please help review the patch or provide better fix
> suggestions.

There is no need to send periodic pings. Sometimes it takes time to
get replies as time is an important resource that is always limited.
I can see that Horiguchi-san has already provided some feedback, and I
am looking now at your suggestions.
--
Michael
On Tue, Oct 29, 2019 at 07:50:01PM +0900, Kyotaro Horiguchi wrote:
> At Fri, 25 Oct 2019 15:18:34 +0800, "Dongming Liu" <lingce.ldm@alibaba-inc.com> wrote in
>> 1. SyncRepCleanupAtProcExit may delete an element that has already been deleted
>> [...]
>> IMO, like SyncRepCancelWait, we should acquire SyncRepLock first and then check
>> whether the queue element is detached or not.
>
> I think you're right here.

Indeed. Looking at the surroundings, we expect some code paths to hold
SyncRepLock in exclusive or shared mode, but we don't actually check
that the lock is held. So let's add some assertions while on it.

> This is not right. It is in transaction commit so it is in a
> HOLD_INTERRUPTS section. ProcessInterrupts does not respond to
> cancel/die interrupts, thus the ereport should return.

Yeah. There is an easy way to check that: InterruptHoldoffCount
needs to be strictly positive.

My suggestions are attached. Any thoughts?
--
Michael
At Wed, 30 Oct 2019 10:45:11 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> Indeed. Looking at the surroundings, we expect some code paths to hold
> SyncRepLock in exclusive or shared mode, but we don't actually check
> that the lock is held. So let's add some assertions while on it.

I looked around closer. If we do that strictly, other functions like
SyncRepGetOldestSyncRecPtr need the same Assert()s. I think static
functions don't need Assert() and a caution in their comments would be
enough.

On the other hand, the similar-looking code in SyncRepInitConfig and
SyncRepUpdateSyncStandbysDefined seems safe since, AFAICS, it doesn't
have (this kind of) race condition on writes. It might need a comment
saying so. Or, we could go to the (apparently) safer side by applying
the same amendment to it.

SyncRepReleaseWaiters reads MyWalSnd->sync_standby_priority without
holding SyncRepLock, which could lead to a message with a wrong
priority. I'm not sure it matters, though.

> Yeah. There is an easy way to check that: InterruptHoldoffCount
> needs to be strictly positive.
>
> My suggestions are attached. Any thoughts?

Seems reasonable for the holdoffs. The same assertion would be needed in
more places, but that's another issue.

By the way, while I was looking at this I found a typo. Please find the
attached.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index a21f7d3347..16aee1de4c 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -1065,8 +1065,8 @@ SyncRepUpdateSyncStandbysDefined(void)
 
     /*
      * If synchronous_standby_names has been reset to empty, it's futile
-     * for backends to continue to waiting.  Since the user no longer
-     * wants synchronous replication, we'd better wake them up.
+     * for backends to continue waiting.  Since the user no longer wants
+     * synchronous replication, we'd better wake them up.
      */
     if (!sync_standbys_defined)
     {
On Wed, Oct 30, 2019 at 9:12 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Oct 29, 2019 at 01:40:41PM +0800, Dongming Liu wrote:
>> Can someone help me to confirm that these two problems are bugs?
>> If they are bugs, please help review the patch or provide better fix
>> suggestions.
>
> There is no need to send periodic pings. Sometimes it takes time to
> get replies as time is an important resource that is always limited.

Thank you for your reply. I realize my mistake; thank you for correcting me.

> I can see that Horiguchi-san has already provided some feedback, and I
> am looking now at your suggestions.
Thanks again.
On Oct 29, 2019, at 18:50, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> Hello.
>
> At Fri, 25 Oct 2019 15:18:34 +0800, "Dongming Liu" <lingce.ldm@alibaba-inc.com> wrote in
>> 1. SyncRepCleanupAtProcExit may delete an element that has already been deleted
>> [...]
>> IMO, like SyncRepCancelWait, we should acquire SyncRepLock first and then check
>> whether the queue element is detached or not.
>
> I think you're right here.

Thanks.

>> 2. SyncRepWaitForLSN may not call SyncRepCancelWait if ereport processes an interrupt
>> [...]
>
> This is not right. It is in transaction commit so it is in a
> HOLD_INTERRUPTS section. ProcessInterrupts does not respond to
> cancel/die interrupts, thus the ereport should return.

I read the relevant code and you are right. I called SyncRepWaitForLSN somewhere else,
but forgot to put it inside a HOLD_INTERRUPTS section, which triggered the exception.
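For the record, a rough illustration of the calling convention I had missed; the call
site and the commit_lsn variable are made up, and the two-argument form follows recent
branches:

    /*
     * SyncRepWaitForLSN() expects to run with interrupts held, as it does
     * in the transaction-commit path.
     */
    HOLD_INTERRUPTS();
    SyncRepWaitForLSN(commit_lsn, true);
    RESUME_INTERRUPTS();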
regards.
—
Dongming Liu
On Oct 30, 2019, at 09:45, Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Oct 29, 2019 at 07:50:01PM +0900, Kyotaro Horiguchi wrote:
>> At Fri, 25 Oct 2019 15:18:34 +0800, "Dongming Liu" <lingce.ldm@alibaba-inc.com> wrote in
>>> 1. SyncRepCleanupAtProcExit may delete an element that has already been deleted
>>> [...]
>>
>> I think you're right here.
>
> Indeed. Looking at the surroundings, we expect some code paths to hold
> SyncRepLock in exclusive or shared mode, but we don't actually check
> that the lock is held. So let's add some assertions while on it.
>
>> This is not right. It is in transaction commit so it is in a
>> HOLD_INTERRUPTS section. ProcessInterrupts does not respond to
>> cancel/die interrupts, thus the ereport should return.
>
> Yeah. There is an easy way to check that: InterruptHoldoffCount
> needs to be strictly positive.
>
> My suggestions are attached. Any thoughts?
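Just to confirm my understanding, the two kinds of assertions being discussed would look
roughly like this (my own sketch, not necessarily how the attached patch places them):

    /* on entry to SyncRepWaitForLSN(): the commit path must be holding interrupts */
    Assert(InterruptHoldoffCount > 0);

    /* in the queue-manipulating code paths: the caller must hold SyncRepLock */
    Assert(LWLockHeldByMe(SyncRepLock));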
Thanks, this patch looks good to me.
On Wed, Oct 30, 2019 at 4:16 PM lingce.ldm <lingce.ldm@alibaba-inc.com> wrote:
> On Oct 29, 2019, at 18:50, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>> At Fri, 25 Oct 2019 15:18:34 +0800, "Dongming Liu" <lingce.ldm@alibaba-inc.com> wrote in
>>> 1. SyncRepCleanupAtProcExit may delete an element that has already been deleted
>>> [...]
>>> IMO, like SyncRepCancelWait, we should acquire SyncRepLock first and then check
>>> whether the queue element is detached or not.
>>
>> I think you're right here.

This change causes every exiting backend to always take the exclusive lock
even when it is not in the SyncRep queue. This may be problematic, for example,
when terminating multiple backends at the same time. If so, it might be better
to check SHMQueueIsDetached() again after taking the lock. That is:

    if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
    {
        LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
        if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
            SHMQueueDelete(&(MyProc->syncRepLinks));
        LWLockRelease(SyncRepLock);
    }

Regards,

--
Fujii Masao
Hello.

At Wed, 30 Oct 2019 17:21:17 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in
> This change causes every exiting backend to always take the exclusive lock
> even when it is not in the SyncRep queue. This may be problematic, for example,
> when terminating multiple backends at the same time. If so, it might be better
> to check SHMQueueIsDetached() again after taking the lock. That is:
> [...]

I'm not sure how much that harms, but double-checked locking
(releasing) is simple enough for reducing possible congestion here, I
think. In short, +1 for that.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Oct 30, 2019 at 05:21:17PM +0900, Fujii Masao wrote:
> This may be problematic, for example,
> when terminating multiple backends at the same time. If so, it might be better
> to check SHMQueueIsDetached() again after taking the lock.
> [...]

Makes sense. Thanks for the suggestion.
--
Michael
On Wed, Oct 30, 2019 at 12:34:28PM +0900, Kyotaro Horiguchi wrote:
> If we do that strictly, other functions like
> SyncRepGetOldestSyncRecPtr need the same Assert()s. I think static
> functions don't need Assert() and a caution in their comments would be
> enough.

Perhaps. I'd rather be careful though if we meddle again with this
code in the future, as it is shared across multiple places and callers.

> SyncRepReleaseWaiters reads MyWalSnd->sync_standby_priority without
> holding SyncRepLock, which could lead to a message with a wrong
> priority. I'm not sure it matters, though.

The WAL sender is the only writer of its own info in shared memory, so
there is no problem with it reading that data without holding its
spinlock.

> Seems reasonable for the holdoffs. The same assertion would be needed in
> more places, but that's another issue.

Sure.

> By the way, while I was looking at this I found a typo. Please find the
> attached.

Thanks, applied that one.
--
Michael
On Wed, Oct 30, 2019 at 05:43:04PM +0900, Kyotaro Horiguchi wrote:
> At Wed, 30 Oct 2019 17:21:17 +0900, Fujii Masao <masao.fujii@gmail.com> wrote in
>> This change causes every exiting backend to always take the exclusive lock
>> even when it is not in the SyncRep queue. [...]
>> it might be better to check SHMQueueIsDetached() again after taking the lock.
>
> I'm not sure how much that harms, but double-checked locking
> (releasing) is simple enough for reducing possible congestion here, I
> think.

FWIW, I could not measure any actual difference with pgbench -C, up to
500 sessions, an empty input file (with just one meta-command) and -c 20.

I have added some comments in SyncRepCleanupAtProcExit(), and adjusted
the patch with the suggestion from Fujii-san. Any comments?
--
Michael
On Thu, Oct 31, 2019 at 11:12 AM Michael Paquier <michael@paquier.xyz> wrote:
> FWIW, I could not measure any actual difference with pgbench -C, up to
> 500 sessions, an empty input file (with just one meta-command) and -c 20.
>
> I have added some comments in SyncRepCleanupAtProcExit(), and adjusted
> the patch with the suggestion from Fujii-san. Any comments?

Thanks for the patch! Looks good to me.

Regards,

--
Fujii Masao
On Oct 31, 2019, at 10:11, Michael Paquier <michael@paquier.xyz> wrote:
> I have added some comments in SyncRepCleanupAtProcExit(), and adjusted
> the patch with the suggestion from Fujii-san. Any comments?

Thanks for the patch. Looks good to me, +1.

Regards,
—
Dongming Liu
On Thu, Oct 31, 2019 at 05:38:32PM +0900, Fujii Masao wrote:
> Thanks for the patch! Looks good to me.

Thanks. I have applied the core fix down to 9.4, leaving the new
assertion improvements only for HEAD.
--
Michael