Re: [BUGS] BUG #14721: Assertion of synchronous replication - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: [BUGS] BUG #14721: Assertion of synchronous replication
Date
Msg-id 22a6cc48-0803-49d9-d1f3-dd5206f24c21@iki.fi
Whole thread Raw
In response to Re: [BUGS] BUG #14721: Assertion of synchronous replication  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-bugs
On 06/29/2017 08:11 AM, Thomas Munro wrote:
> On Thu, Jun 29, 2017 at 4:27 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>>         /*
>>          * Acquiring the lock is not needed, the latch ensures proper
>>          * barriers. If it looks like we're done, we must really be done,
>>          * because once walsender changes the state to SYNC_REP_WAIT_COMPLETE,
>>          * it will never update it again, so we can't be seeing a stale value
>>          * in that case.
>>          */
>
> Yeah, counting on the latch for free barriers doesn't work if you
> happen to see SYNC_REP_WAIT_COMPLETE first time through the loop, or
> if you see it after a spurious signal woke you and then it's
> immediately set to SYNC_REP_WAIT_COMPLETE.  In those cases, the
> following Assert statement is making an assertion about cache
> coherency that doesn't work even on a friendly TSO system.

Ah yes, that's clearly broken.

> Can you reproduce the problem with this experimental patch applied?

I was able to reproduce this, after adding some sleeps (see attached). 
It must be pretty hard to hit it normally, or we would've gotten reports 
earlier. Although it only really shows with assertions enabled, so that 
could also be why.

In your patch, I think the pg_read_barrier() is only necessary, if 
assertions are enabled. Then again, better safe than sorry, and at least 
on my x86-64 machine with gcc, it doesn't actually change the produced 
machine code at all.

I committed your patch and backported to 9.5, with some additional 
comments. Thanks for the report Const!

- Heikki


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attachment

pgsql-bugs by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: [BUGS] BUG #14654: With high statistics targets on ts_vector,unexpectedly high memory use & OOM are triggered
Next
From: Heikki Linnakangas
Date:
Subject: Re: [BUGS] BUG #14722: Segfault in tuplesort_heap_siftup, 32 bitoverflow