Thread: [BUGS] BUG #14680: startup process on standby encounter a deadlock ofTwoPhaseStateLock when redo 2PC xlog

The following bug has been logged on the website:

Bug reference:      14680
Logged by:          chuanting  wang
Email address:      wangchuanting@huawei.com
PostgreSQL version: 10beta1
Operating system:   SuSE
Description:

startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.

the stack is:
Thread 6 (Thread 0x7f002e3ff700 (LWP 35305)):
#0  0x00007f1106167627 in semop () from /lib64/libc.so.6
#1  0x0000000000ce5622 in PGSemaphoreLock(PGSemaphoreData*, bool) ()
#2  0x0000000000d73e4d in LWLockAcquire(LWLockId, LWLockMode) ()
#3  0x0000000000969028 in RemoveGXact(GlobalTransactionData*) ()
#4  0x000000000096b254 in ProcessTwoPhaseBuffer(unsigned int, XLogRecPtr,
bool, bool, bool) ()
#5  0x000000000096b6b8 in PrescanPreparedTransactions(unsigned int**, int*)
()
#6  0x000000000097af97 in xlog_redo(XLogReaderState*) ()
#7  0x00000000009839df in StartupXLOG() ()
#8  0x0000000000cffd0b in StartupProcessMain() ()
#9  0x0000000000991539 in AuxiliaryProcessMain(int, char**) ()
#10 0x0000000000cfb209 in SubPostmasterMain(int, char**) ()
#11 0x0000000000cfb4f2 in MainStarterThreadFunc(void*) ()
#12 0x00007f11099a17b6 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f1106165d6d in clone () from /lib64/libc.so.6
#14 0x0000000000000000 in ?? ()

the reason is:
when do PrescanPreparedTransactions, startup process already granted a
shared lock on TwoPhaseStateLock, then it called RemoveGXact, but in
RemoveGXact it acquired exclusive lock on TwoPhaseStateLock, so it blocked
by itself


anyone, has some idea on how to fix it?



--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

wangchuanting@huawei.com writes:
> startup process on standby encounter a deadlock of TwoPhaseStateLock when
> redo 2PC xlog.

Please provide an example of how to get into this state.
        regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> wangchuanting@huawei.com writes:
>> startup process on standby encounter a deadlock of TwoPhaseStateLock when
>> redo 2PC xlog.
>
> Please provide an example of how to get into this state.

That would help. Are you seeing in the logs something like "removing
future two-phase state from memory for XXX" or "removing stale
two-phase state from shared memory for XXX"?

Even with that, the light-weight lock sequence taken in those code
paths look definitely wrong to me, we should not take twice
TwoPhaseStateLock in the same code path. I think that we should remove
the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
upgrade the locks of PrescanPreparedTransactions() and
StandbyRecoverPreparedTransactions() to be exclusive. We still need to
keep a lock as CheckPointTwoPhase() may still be triggered by the
checkpoint. Tom, what do you think?
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

we use Postgres-XC+pg10(4 coordinator + 4 datanode(pg 10 as datanode, and 1
master 1 standby for each datanode)), and benchmark tpcc, there is some
cross datanode transactions that use 2pc, during testing, we restart the
cluster, then one datanode standby can not recovery done and hangup with
TwoPhaseStateLock deadlock.

and sorry, we reinstall the cluster, and the log is removed, we will try to
reproduce, but anyway, the code is not right like Michael saied.



--
View this message in context:
http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964241.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> wangchuanting@huawei.com writes:
>>> startup process on standby encounter a deadlock of TwoPhaseStateLock when
>>> redo 2PC xlog.
>>
>> Please provide an example of how to get into this state.
>
> That would help. Are you seeing in the logs something like "removing
> future two-phase state from memory for XXX" or "removing stale
> two-phase state from shared memory for XXX"?
>
> Even with that, the light-weight lock sequence taken in those code
> paths look definitely wrong to me, we should not take twice
> TwoPhaseStateLock in the same code path. I think that we should remove
> the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
> upgrade the locks of PrescanPreparedTransactions() and
> StandbyRecoverPreparedTransactions() to be exclusive. We still need to
> keep a lock as CheckPointTwoPhase() may still be triggered by the
> checkpoint. Tom, what do you think?

Attached is what I was thinking about for reference. I just came back
from a long flight and I am pretty tired, so my brain may have missed
something. I'll take again a look at this issue on Monday, an open
item has been added for now.
-- 
Michael

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attachment
On Thu, Jun 1, 2017 at 12:11 AM, wangchuanting <wangchuanting@huawei.com> wrote:
> we use Postgres-XC+pg10(4 coordinator + 4 datanode(pg 10 as datanode, and 1
> master 1 standby for each datanode)), and benchmark tpcc, there is some
> cross datanode transactions that use 2pc, during testing, we restart the
> cluster, then one datanode standby can not recovery done and hangup with
> TwoPhaseStateLock deadlock.

(Former Postgres-XC maintainer here)
Are you aware of the fact that this is not going to work? Postgres
protocol has been extended between coordinators and datanodes to be
able to push down transaction ID, snapshot as well as timestamps when
running transactions across nodes. So by using a community Postgres as
a datanode you break the global consistency of the cluster. There are
also a couple of other things proper to datanodes.

> and sorry, we reinstall the cluster, and the log is removed, we will try to
> reproduce, but anyway, the code is not right like Michael said.

On that I agree.
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

>So by using a community Postgres as 
>a datanode you break the global consistency of the cluster. There are 
>also a couple of other things proper to datanodes. 

sorry, i confuse you, actually we merged "Speedup 2PC recovery by skipping
two phase state files in normal path" to pgxc.



--
View this message in context:
http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964398.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Hi Michael,

appreciate for your quick response.

i look carefully on the patch, because of we removed TwoPhaseStateLock
lwlock acquire in `RemoveGXact' and let caller held lwlock, so i think:
1. xact_redo also need held lwlock before call PrepareRedoRemove
2. RecoverPreparedTransactions also need held lwlock before call
ProcessTwoPhaseBuffer







--
View this message in context:
http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964407.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Thu, Jun 01, 2017 at 01:07:53AM -0700, Michael Paquier wrote:
> On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> wangchuanting@huawei.com writes:
> >>> startup process on standby encounter a deadlock of TwoPhaseStateLock when
> >>> redo 2PC xlog.
> >>
> >> Please provide an example of how to get into this state.
> >
> > That would help. Are you seeing in the logs something like "removing
> > future two-phase state from memory for XXX" or "removing stale
> > two-phase state from shared memory for XXX"?
> >
> > Even with that, the light-weight lock sequence taken in those code
> > paths look definitely wrong to me, we should not take twice
> > TwoPhaseStateLock in the same code path. I think that we should remove
> > the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
> > upgrade the locks of PrescanPreparedTransactions() and
> > StandbyRecoverPreparedTransactions() to be exclusive. We still need to
> > keep a lock as CheckPointTwoPhase() may still be triggered by the
> > checkpoint. Tom, what do you think?
> 
> Attached is what I was thinking about for reference. I just came back
> from a long flight and I am pretty tired, so my brain may have missed
> something. I'll take again a look at this issue on Monday, an open
> item has been added for now.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Simon,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



On Fri, Jun 2, 2017 at 10:59 AM, wangchuanting <wangchuanting@huawei.com> wrote:
> Appreciate for your quick response.

Back into business for this issue.

> i look carefully on the patch, because of we removed TwoPhaseStateLock
> lwlock acquire in `RemoveGXact' and let caller held lwlock, so i think:
> 1. xact_redo also need held lwlock before call PrepareRedoRemove
> 2. RecoverPreparedTransactions also need held lwlock before call
> ProcessTwoPhaseBuffer

Thanks for the input. I have reviewed all the code paths that have
been modified, and strengthened the code with assertions using
LWLockHeldByMeInMode() to make sure that the correct lock is always
hold in those code paths. There is actually no point in holding the
lock in restoreTwoPhaseData(), but as this makes the code less
consistent with the rest I added one. Also, I have replaced the lock
acquisition in PrepareRedoAdd() with acquisitions at higher levels,
and added an assertion in this routine. This makes the 2PC state data
addition and removal more consistent with each other.
-- 
Michael

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attachment
On Mon, Jun 5, 2017 at 7:24 AM, Noah Misch <noah@leadboat.com> wrote:
> [Action required within three days.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 10 open item.  Simon,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.

I have just provided a patch that takes care of solving this issue and
cleans up the lock handling for all the 2PC redo code paths.
-- 
Michael



Hi, Michael

I don't understand why the patch remove TwoPhaseStateLock in MarkAsPrepared.

if so:
1. does it need remove lock acquire in MarkAsPreparing also?
2. MarkAsPrepared will call ProcArrayAdd, and in ProcArrayAdd it acquires
ProcArrayLock, we should carefully handle the lock sequence about these two
locks.






--
View this message in context:
http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964796.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Mon, Jun 5, 2017 at 8:32 PM, wangchuanting <wangchuanting@huawei.com> wrote:
> I don't understand why the patch remove TwoPhaseStateLock in MarkAsPrepared.

This one is on purpose, no low-level routines in this patch acquire
TwoPhaseStateLock. Logically I agree that it does not change much
but... You wrote something below about that.

> if so:
> 1. does it need remove lock acquire in MarkAsPreparing also?

No, MarkAsPreparing() is used only in the non-redo code path.

> 2. MarkAsPrepared will call ProcArrayAdd, and in ProcArrayAdd it acquires
> ProcArrayLock, we should carefully handle the lock sequence about these two
> locks.

And this gives a reason to not complicate our lives.
-- 
Michael

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attachment
On Sun, Jun 04, 2017 at 10:24:30PM +0000, Noah Misch wrote:
> On Thu, Jun 01, 2017 at 01:07:53AM -0700, Michael Paquier wrote:
> > On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
> > <michael.paquier@gmail.com> wrote:
> > > On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> wangchuanting@huawei.com writes:
> > >>> startup process on standby encounter a deadlock of TwoPhaseStateLock when
> > >>> redo 2PC xlog.
> > >>
> > >> Please provide an example of how to get into this state.
> > >
> > > That would help. Are you seeing in the logs something like "removing
> > > future two-phase state from memory for XXX" or "removing stale
> > > two-phase state from shared memory for XXX"?
> > >
> > > Even with that, the light-weight lock sequence taken in those code
> > > paths look definitely wrong to me, we should not take twice
> > > TwoPhaseStateLock in the same code path. I think that we should remove
> > > the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
> > > upgrade the locks of PrescanPreparedTransactions() and
> > > StandbyRecoverPreparedTransactions() to be exclusive. We still need to
> > > keep a lock as CheckPointTwoPhase() may still be triggered by the
> > > checkpoint. Tom, what do you think?
> > 
> > Attached is what I was thinking about for reference. I just came back
> > from a long flight and I am pretty tired, so my brain may have missed
> > something. I'll take again a look at this issue on Monday, an open
> > item has been added for now.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Simon,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
> 
> [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



On Fri, Jun 9, 2017 at 3:17 PM, Noah Misch <noah@leadboat.com> wrote:
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

I have sent an updated patch simplifying the locking here:
https://www.postgresql.org/message-id/CAB7nPqQthLP9GvD2242epHKOBkDMd+03tSuFvK3cVZsGarQyWA@mail.gmail.com
-- 
Michael



On Thu, Jun 08, 2017 at 11:17:38PM -0700, Noah Misch wrote:
> On Sun, Jun 04, 2017 at 10:24:30PM +0000, Noah Misch wrote:
> > On Thu, Jun 01, 2017 at 01:07:53AM -0700, Michael Paquier wrote:
> > > On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
> > > <michael.paquier@gmail.com> wrote:
> > > > On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > >> wangchuanting@huawei.com writes:
> > > >>> startup process on standby encounter a deadlock of TwoPhaseStateLock when
> > > >>> redo 2PC xlog.
> > > >>
> > > >> Please provide an example of how to get into this state.
> > > >
> > > > That would help. Are you seeing in the logs something like "removing
> > > > future two-phase state from memory for XXX" or "removing stale
> > > > two-phase state from shared memory for XXX"?
> > > >
> > > > Even with that, the light-weight lock sequence taken in those code
> > > > paths look definitely wrong to me, we should not take twice
> > > > TwoPhaseStateLock in the same code path. I think that we should remove
> > > > the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
> > > > upgrade the locks of PrescanPreparedTransactions() and
> > > > StandbyRecoverPreparedTransactions() to be exclusive. We still need to
> > > > keep a lock as CheckPointTwoPhase() may still be triggered by the
> > > > checkpoint. Tom, what do you think?
> > > 
> > > Attached is what I was thinking about for reference. I just came back
> > > from a long flight and I am pretty tired, so my brain may have missed
> > > something. I'll take again a look at this issue on Monday, an open
> > > item has been added for now.
> > 
> > [Action required within three days.  This is a generic notification.]
> > 
> > The above-described topic is currently a PostgreSQL 10 open item.  Simon,
> > since you committed the patch believed to have created it, you own this open
> > item.  If some other commit is more relevant or if this does not belong as a
> > v10 open item, please let us know.  Otherwise, please observe the policy on
> > open item ownership[1] and send a status update within three calendar days of
> > this message.  Include a date for your subsequent status update.  Testers may
> > discover new open items at any time, and I want to plan to get them all fixed
> > well in advance of shipping v10.  Consequently, I will appreciate your efforts
> > toward speedy resolution.  Thanks.
> > 
> > [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> 
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
for your status update.  Please reacquaint yourself with the policy on open
item ownership[1] and then reply immediately.  If I do not hear from you by
2017-06-11 07:00 UTC, I will transfer this item to release management team
ownership without further notice.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



Noah Misch wrote:

> IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
> for your status update.  Please reacquaint yourself with the policy on open
> item ownership[1] and then reply immediately.  If I do not hear from you by
> 2017-06-11 07:00 UTC, I will transfer this item to release management team
> ownership without further notice.

I volunteer to own this item.  I'll report on Wednesday 14th at the latest.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Michael Paquier wrote:

> > i look carefully on the patch, because of we removed TwoPhaseStateLock
> > lwlock acquire in `RemoveGXact' and let caller held lwlock, so i think:
> > 1. xact_redo also need held lwlock before call PrepareRedoRemove
> > 2. RecoverPreparedTransactions also need held lwlock before call
> > ProcessTwoPhaseBuffer
> 
> Thanks for the input. I have reviewed all the code paths that have
> been modified, and strengthened the code with assertions using
> LWLockHeldByMeInMode() to make sure that the correct lock is always
> hold in those code paths. There is actually no point in holding the
> lock in restoreTwoPhaseData(), but as this makes the code less
> consistent with the rest I added one. Also, I have replaced the lock
> acquisition in PrepareRedoAdd() with acquisitions at higher levels,
> and added an assertion in this routine. This makes the 2PC state data
> addition and removal more consistent with each other.

Thanks for the patch -- I agree that the new coding looks better.

I also agree that restoreTwoPhaseData doesn't need to hold the lock,
since we don't expect anything running concurrently, but that it's okay
to hold it and makes the whole thing simpler.

We could try to apply an equivalent rationale to
RecoverPreparedTransactions: even though we have already been running in
HOT standby mode for a while, there's no possibility of concurrent
activity either: since we exited recovery, master cannot write any more
2PC xacts, and we haven't yet set the flag that lets open sessions write
WAL.  However, it seems mildly dangerous to run the bottom sections of
the loop with the lock held, since it acquires other lwlocks and
generally does a lot of crap.

Also, let's add an lwlock-is-held assertion to MarkAsPreparingGuts.

BTW: I think that saving one redundant line of code is not worth the
ugliness that it costs in xact_redo, so let's undo that.  The patch is a
bit bulky, but the resulting code is simpler.

In short, I propose the attached.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attachment
Alvaro Herrera wrote:

For some reason I sent a preliminary version of the email I was writing.
Some additional thoughts that were supposed to be in there:

> I also agree that restoreTwoPhaseData doesn't need to hold the lock,
> since we don't expect anything running concurrently, but that it's okay
> to hold it and makes the whole thing simpler.

It's okay to hold the lock for the whole duration of the function.

> We could try to apply an equivalent rationale to
> RecoverPreparedTransactions: even though we have already been running in
> HOT standby mode for a while, there's no possibility of concurrent
> activity either: since we exited recovery, master cannot write any more
> 2PC xacts, and we haven't yet set the flag that lets open sessions write
> WAL.  However, it seems mildly dangerous to run the bottom sections of
> the loop with the lock held, since it acquires other lwlocks and
> generally does a lot of crap.

Therefore, let's adopt your idea of acquiring the lock only for the
specific low-level calls that require it.  But the comment atop the loop
needs a lot of work to explain why it's doing that (i.e. how come it's
reading TwoPhaseState without a lock), as in my proposed patch.

(At first, I didn't really like this very much and wanted to add more
locking to that function, but it turned quite messy, so I back-tracked).

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

wangchuanting wrote:
> we use Postgres-XC+pg10(4 coordinator + 4 datanode(pg 10 as datanode, and 1
> master 1 standby for each datanode)), and benchmark tpcc, there is some
> cross datanode transactions that use 2pc, during testing, we restart the
> cluster, then one datanode standby can not recovery done and hangup with
> TwoPhaseStateLock deadlock.
> 
> and sorry, we reinstall the cluster, and the log is removed, we will try to
> reproduce, but anyway, the code is not right like Michael saied.

Did you have any luck reproducing the original issue?  It would be good
if you could confirm that the bug is gone with the patch I posted in
https://www.postgresql.org/message-id/20170611022536.goef4cdbmgicye5g@alvherre.pgsql

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Sun, Jun 11, 2017 at 12:24 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Alvaro Herrera wrote:
>
> For some reason I sent a preliminary version of the email I was writing.
> Some additional thoughts that were supposed to be in there:

No problem. Thanks for the input.

>> I also agree that restoreTwoPhaseData doesn't need to hold the lock,
>> since we don't expect anything running concurrently, but that it's okay
>> to hold it and makes the whole thing simpler.
>
> It's okay to hold the lock for the whole duration of the function.

Yup.

>> We could try to apply an equivalent rationale to
>> RecoverPreparedTransactions: even though we have already been running in
>> HOT standby mode for a while, there's no possibility of concurrent
>> activity either: since we exited recovery, master cannot write any more
>> 2PC xacts, and we haven't yet set the flag that lets open sessions write
>> WAL.  However, it seems mildly dangerous to run the bottom sections of
>> the loop with the lock held, since it acquires other lwlocks and
>> generally does a lot of crap.
>
> Therefore, let's adopt your idea of acquiring the lock only for the
> specific low-level calls that require it.  But the comment atop the loop
> needs a lot of work to explain why it's doing that (i.e. how come it's
> reading TwoPhaseState without a lock), as in my proposed patch.
>
> (At first, I didn't really like this very much and wanted to add more
> locking to that function, but it turned quite messy, so I back-tracked).

Yes, I got the same temptation but give up as well because of the
unnecessary complications. GXactLoadSubxactData() does nothing fancy,
but my worries are in ProcessRecords() and PostPrepare_Twophase(),
particularly if they do more complicated and fancy stuff in the future
for whatever reason. The comment you have added is this one:
        /*
-        * Don't need a lock in the recovery phase.
+        * It's okay to access TwoPhaseState without a lock here: recovery is
+        * finished (so if we were a standby, there's no master that can prepare
+        * transactions anymore), and we haven't yet set WAL as open for writes,
+        * so local existing backends, if any, cannot do so either.
We could use a
+        * coding pattern similar to restoreTwoPhaseData, i.e., run
the whole loop
+        * with the lock held; but this loop is far more complex, so
instead only
+        * grab the lock while calling the low-level functions that require it.
         */
I have reworked the comment as follows:
    /*
-    * Don't need a lock in the recovery phase.
+    * It is fine to access TwoPhaseState without a lock here: recovery is
+    * finished (so if we were a standby, there's no master that can prepare
+    * transactions anymore), and we haven't yet set WAL as open for writes,
+    * so local existing backends, if any, cannot do so either.  We could use a
+    * coding pattern similar to restoreTwoPhaseData, i.e., run the whole loop
+    * with the lock held; but this loop is far more complex, so instead only
+    * grab the lock while calling the low-level functions working directly on
+    * manipulating the two-phase state data.  Functions working directly on
+    * PGPROC entries linked with the two-phase transaction work with other
+    * types of locks but we don't want to complicate that more than necessary.
     */
The v5 attached does not have any other changes than what you did in
v4 (plus typos noticed).

-   if (info == XLOG_XACT_COMMIT || info == XLOG_XACT_COMMIT_PREPARED)
+   if (info == XLOG_XACT_COMMIT)
    {
        xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
        xl_xact_parsed_commit parsed;
No objections to the shuffling done here.

+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as its may be updated.
Typo noticed here => s/its/it/.
-- 
Michael

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attachment
> Did you have any luck reproducing the original issue?  

sorry, not reproduced



--
View this message in context:
http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5966050.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Mon, Jun 12, 2017 at 3:25 PM, wangchuanting <wangchuanting@huawei.com> wrote:
>> Did you have any luck reproducing the original issue?
>
> sorry, not reproduced

I am a bit confused. Do you mean that you applied the patch and did
not see the issue again? Or do you mean that you did not have the time
to check it?
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

>I am a bit confused. Do you mean that you applied the patch and did 
>not see the issue again? Or do you mean that you did not have the time 
>to check it?

i means: this issue occurs only once in my test, and have not reproduce it.

i review the patch, and i think the patch fix the issue(but i have not
applied it, i will apply it tomorrow).




--
View this message in context:
http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5966106.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Michael Paquier wrote:
> On Sun, Jun 11, 2017 at 12:24 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:

> I have reworked the comment as follows:
>     /*
> -    * Don't need a lock in the recovery phase.
> +    * It is fine to access TwoPhaseState without a lock here: recovery is
> +    * finished (so if we were a standby, there's no master that can prepare
> +    * transactions anymore), and we haven't yet set WAL as open for writes,
> +    * so local existing backends, if any, cannot do so either.  We could use a
> +    * coding pattern similar to restoreTwoPhaseData, i.e., run the whole loop
> +    * with the lock held; but this loop is far more complex, so instead only
> +    * grab the lock while calling the low-level functions working directly on
> +    * manipulating the two-phase state data.  Functions working directly on
> +    * PGPROC entries linked with the two-phase transaction work with other
> +    * types of locks but we don't want to complicate that more than necessary.
>      */

Hmm.  Honestly I don't like the final sentence you added.  I find it
more confusing than useful, because it doesn't explain what these "other
types of locks" are, or why we care.

However, I found out that this rationale is likely not true, because the
checkpointer may be running concurrently with this code from startup
process, and checkpointer does process 2PC data.  Maybe there are other
reasons why there's no live bug here, but it looks wrong (I didn't try
to reproduce a problem).

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Tue, Jun 13, 2017 at 7:49 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> However, I found out that this rationale is likely not true, because the
> checkpointer may be running concurrently with this code from startup
> process, and checkpointer does process 2PC data.  Maybe there are other
> reasons why there's no live bug here, but it looks wrong (I didn't try
> to reproduce a problem).

The current coding is actually safe because the checkpointer does not
remove or add any 2PC entry in the array while holding
TwoPhaseStateLock, it just updates some values that need to be read
and/or written while holding the lock. Well, to be honest, HEAD is
wrong because it can read a flag value while the checkpointer updates
it, and the patch is careful to change that to be correct. The wrong
part is when calling ProcessTwoPhaseBuffer() in
RecoverPreparedTransactions() which accesses gxact->ondisk and
prepare_start_lsn without locking things.
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Michael Paquier wrote:

> The current coding is actually safe because the checkpointer does not
> remove or add any 2PC entry in the array while holding
> TwoPhaseStateLock, it just updates some values that need to be read
> and/or written while holding the lock. Well, to be honest, HEAD is
> wrong because it can read a flag value while the checkpointer updates
> it, and the patch is careful to change that to be correct. The wrong
> part is when calling ProcessTwoPhaseBuffer() in
> RecoverPreparedTransactions() which accesses gxact->ondisk and
> prepare_start_lsn without locking things.

Honestly I don't like this rationale very much.  Even if doing the
unlocked access is safe today, it looks like installing a landmine for
the future, and for what?  I don't think there's a lot to be gained:
RecoverPreparedTransactions only runs once in the life of a server, and
CheckPointTwoPhase is supposed to have a very short runtime (per
explanation in comments therein).  It seems better to me to continue our
tradition of using the appropriate locks instead of playing a dangerous
game.

So I propose that RecoverPreparedTransactions grabs exclusive lock at
the top, and only the bottom part of the loop is done unlocked, which
AFAICS should be safe.  (MarkAsPrepared gained a boolean argument
indicating that caller already holds lock).

Here's a patch along those lines.  The full testsuite is running now,
but the recovery tests pass fine.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Attachment
On Wed, Jun 14, 2017 at 8:08 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> So I propose that RecoverPreparedTransactions grabs exclusive lock at
> the top, and only the bottom part of the loop is done unlocked, which
> AFAICS should be safe.  (MarkAsPrepared gained a boolean argument
> indicating that caller already holds lock).

Logically both approaches are really close as with this approach the
additional locked area is when scanning the entries of TwoPhaseState.
But I agree that your suggestion would be safer in the long run. This
looks good to me after a close look.
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Michael Paquier wrote:
> On Wed, Jun 14, 2017 at 8:08 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > So I propose that RecoverPreparedTransactions grabs exclusive lock at
> > the top, and only the bottom part of the loop is done unlocked, which
> > AFAICS should be safe.  (MarkAsPrepared gained a boolean argument
> > indicating that caller already holds lock).
> 
> Logically both approaches are really close as with this approach the
> additional locked area is when scanning the entries of TwoPhaseState.
> But I agree that your suggestion would be safer in the long run. This
> looks good to me after a close look.

Thanks, pushed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

On Thu, Jun 15, 2017 at 12:35 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Michael Paquier wrote:
>> On Wed, Jun 14, 2017 at 8:08 AM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>> > So I propose that RecoverPreparedTransactions grabs exclusive lock at
>> > the top, and only the bottom part of the loop is done unlocked, which
>> > AFAICS should be safe.  (MarkAsPrepared gained a boolean argument
>> > indicating that caller already holds lock).
>>
>> Logically both approaches are really close as with this approach the
>> additional locked area is when scanning the entries of TwoPhaseState.
>> But I agree that your suggestion would be safer in the long run. This
>> looks good to me after a close look.
>
> Thanks, pushed.

Thanks for the commit.
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs